Model Card for Cohere Labs Command A Reasoning

Model Summary

Cohere Labs Command A Reasoning is an open weights research release of a 111 billion parameter model with reasoning capabilities, optimized for tool use, agentic, and multilingual use cases. The model can be used either with reasoning on for increased performance or with reasoning off for lower-latency responses, controlled by the 'reasoning' parameter.

Developed by: Cohere and Cohere Labs

Point of Contact: Cohere Labs
License: CC-BY-NC; use also requires adhering to Cohere Labs' Acceptable Use Policy
Model: command-a-reasoning-08-2025
Model Size: 111 billion parameters
Context length: 256K

For more details about this model, please check out our blog post.

Try Cohere Labs Command A Reasoning

You can try out Cohere Labs Command A Reasoning before downloading the weights in our hosted Hugging Face Space.

Usage

Please install transformers from the source repository that includes the necessary changes for this model.

# pip install transformers accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereLabs/command-a-reasoning-08-2025"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [
    {"role": "user", "content": "How many r's are there in strawberry?"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    reasoning=True,  # set to False to turn off reasoning
).to(model.device)


outputs = model.generate(**inputs, max_new_tokens=2048)
decoded = [tokenizer.decode(out, skip_special_tokens=False) for out in outputs]
print(decoded[0])

You should get output that looks like the following, where the thinking is generated between the <|START_THINKING|> and <|END_THINKING|> tokens:

<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|START_THINKING|>Okay, let's see. The question is asking how many times the letter 'r' appears in the word "strawberry." Hmm, I need to count each occurrence of 'r' in that word. Let me start by spelling out the word to make sure I get all the letters right.
S-T-R-A-W-B-E-R-R-Y. Wait, let me check that again. S-T-R (that's the first 'r'), then A-W-B-E, and then another R-R (so two 'r's there), followed by Y. Let me write it out with positions to be sure:
1: S
2: T
3: R
4: A
5: W
6: B
7: E
8: R
9: R
10: Y
So looking at each letter, the 'r's are at positions 3, 8, and 9. That makes three 'r's in total. Wait, but sometimes people might miscount if they go too fast. Let me verify again. The word is strawberry. Breaking it down:
- The first syllable is "straw," which has an 'r' in it. So S-T-R-A-W. That's one 'r'.
- The second part is "berry." B-E-R-R-Y. In "berry," there are two 'r's. So combining both parts, that's 1 + 2 = 3 'r's.
Alternatively, maybe I can count them one by one. Let's spell it again: S-T-R-A-W-B-E-R-R-Y. So after S-T comes R (1), then A-W-B-E, then R (2), R (3), and Y. Yep, three 'r's. I think that's correct. Sometimes when letters are doubled, like the two 'r's in "berry," it's easy to count them as one if you're not careful, but here they are separate. So the answer should be three.<|END_THINKING|><|START_RESPONSE|>The word "strawberry" is spelled S-T-R-A-W-B-E-R-R-Y. Let's count the letter 'r':
1. **First 'r'**: Position 3 (S-T-**R**-A-W...)
2. **Second 'r'**: Position 8 (...B-E-**R**-...)
3. **Third 'r'**: Position 9 (...**R**-Y)
**Total**: 3 'r's.
**Answer**: There are \boxed{3} r's in "strawberry."<|END_RESPONSE|><|END_OF_TURN_TOKEN|>
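
Because the decoded string keeps its special tokens, downstream code usually needs to separate the thinking from the final response. The following is a minimal sketch of one way to do that with regular expressions. The token names are taken from the sample output above; the helper name split_reasoning and the shortened sample string are hypothetical and not part of the model's tooling.

```python
import re
from typing import Optional, Tuple

# Token names are copied from the sample output; the helper is a hypothetical sketch.
THINKING_RE = re.compile(r"<\|START_THINKING\|>(.*?)<\|END_THINKING\|>", re.DOTALL)
# The response may be cut off by max_new_tokens, so the end token is optional.
RESPONSE_RE = re.compile(r"<\|START_RESPONSE\|>(.*?)(?:<\|END_RESPONSE\|>|$)", re.DOTALL)


def split_reasoning(decoded: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (thinking, response); either is None if its markers are absent."""
    thinking = THINKING_RE.search(decoded)
    response = RESPONSE_RE.search(decoded)
    return (
        thinking.group(1).strip() if thinking else None,
        response.group(1).strip() if response else None,
    )


# Shortened, made-up stand-in for a real decoded generation.
sample = (
    "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
    "<|START_THINKING|>Count the r's: 3, 8, 9.<|END_THINKING|>"
    "<|START_RESPONSE|>There are 3 r's.<|END_RESPONSE|><|END_OF_TURN_TOKEN|>"
)
thinking, response = split_reasoning(sample)
print("thinking:", thinking)
print("response:", response)
```

With reasoning turned off, the thinking part would presumably be empty or missing, in which case the helper returns None for it.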

Reasoning can be turned off by passing reasoning=False to apply_chat_template. The default value is True.

Model Details

Input: Text only.

Output: Model generates text.

Model Architecture: This is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, this model uses supervised fine-tuning (SFT) and preference training to align model behavior to human preferences for helpfulness and safety. The model features three layers with sliding window attention (window size 4096) and RoPE for efficient local context modeling and relative positional encoding. A fourth layer uses global attention without positional embeddings, enabling unrestricted token interactions across the entire sequence.
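
To make the attention layout concrete, here is a small illustrative sketch of which earlier positions a token can attend to in each kind of layer. It is not the model's implementation. It assumes that the three-sliding-window-plus-one-global arrangement repeats through the stack, that attention is causal, and that the window counts the current token; the function names layer_kind and can_attend are hypothetical.

```python
def layer_kind(layer_index: int) -> str:
    """Assumed repeating pattern: three sliding-window layers, then one global layer."""
    return "global" if layer_index % 4 == 3 else "sliding_window"


def can_attend(query_pos: int, key_pos: int, kind: str, window: int = 4096) -> bool:
    """Whether the token at query_pos may attend to the token at key_pos."""
    if key_pos > query_pos:
        return False  # causal: never look at future tokens
    if kind == "global":
        return True  # global layers see the whole prefix
    # Sliding-window layers only see the most recent `window` positions.
    return query_pos - key_pos < window


print([layer_kind(i) for i in range(8)])
print(can_attend(5000, 100, "sliding_window"))  # False: 4900 positions back, outside the window
print(can_attend(5000, 100, "global"))          # True: global layers have no window limit
```

The sketch shows why the global layers matter for long contexts: under these assumptions, information from more than 4096 positions back can only reach a token directly through a global layer.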

Languages covered: The model has been trained on 23 languages: English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Chinese, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian.

Context Length: Command A Reasoning supports a context length of 256K and an output length of 32K.

Tool Use Capabilities:

Command A Reasoning has been specifically trained with conversational tool use capabilities. This allows the model to interact with external tools like APIs, databases, or search engines.

Tool use with Command A Reasoning is supported through chat templates in Transformers. We recommend providing tool descriptions using JSON schema.
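
As a rough illustration of a JSON schema tool description, the sketch below defines one tool and checks a call's arguments against its required parameters. The get_weather tool, its fields, and the missing_required helper are hypothetical. The function-style layout is the one commonly used with Transformers chat templates, and whether this model's template expects exactly this shape is an assumption.

```python
import json

# Hypothetical tool, described with JSON schema in a function-calling layout.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Toronto"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}


def missing_required(tool: dict, arguments: dict) -> list:
    """List required parameters that are absent from a tool call's arguments."""
    required = tool["function"]["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


print(json.dumps(get_weather_tool, indent=2))
print(missing_required(get_weather_tool, {"unit": "celsius"}))  # ['city']

# With a loaded tokenizer, the schema would be passed through the chat template:
# inputs = tokenizer.apply_chat_template(
#     messages, tools=[get_weather_tool], add_generation_prompt=True,
#     tokenize=True, return_dict=True, return_tensors="pt",
# )
```

Validating arguments before executing a tool call is a design choice of this sketch, not something the model card prescribes.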

Model Card Contact

For errors or additional questions about details in this model card, contact labs@cohere.com

Terms of Use:

We hope that releasing the weights of a highly performant 111 billion parameter model to researchers all over the world will make community-based research efforts more accessible. This model is governed by a CC-BY-NC license and also requires adhering to Cohere Labs' Acceptable Use Policy. If you are interested in commercial use, please contact Cohere’s Sales team.

Try Chat:

You can try Command A Reasoning in the playground here. You can also use it in our dedicated Hugging Face Space here.

Safetensors

Model size: 111B params
Tensor type: BF16

Model tree for CohereLabs/command-a-reasoning-08-2025

Base model: CohereLabs/c4ai-command-a-03-2025

Evaluation results

GPQA Diamond (Idavidrein/gpqa): 66.67

URL: https://huggingface.co/CohereLabs/command-a-reasoning-08-2025