Devstral Small 2 24B Instruct 2512 👁 Model Icon

👁 Validated Badge

Devstral is an agentic LLM for software engineering tasks. Devstral Small 2 excels at using tools to explore codebases, editing multiple files and power software engineering agents.
The model achieves remarkable performance on SWE-bench.

This model is an Instruct model in FP8, fine-tuned to follow instructions, making it ideal for chat, agentic and instruction based tasks for SWE use cases.

For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we invite companies to reach out to us.

ModelCar Storage URI: oci://registry.redhat.io/rhai/modelcar-devstral-small-2-24b-instruct-2512:3.0

Validated on vLLM: 0.14.1

Validated on RHAIIS: 3.4 EA1

Validated on RHOAI: 3.4 EA1

Key Features

The Devstral Small 2 Instruct model offers the following capabilities:

Agentic Coding: Devstral is designed to excel at agentic coding tasks, making it a great choice for software engineering agents.
Lightweight: with its compact size of just 24 billion parameters, Devstral is light enough to run on a single RTX 4090 or a Mac with 32GB RAM, making it an appropriate model for local deployment and on-device use.
Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
Context Window: A 256k context window.

Updates compared to Devstral Small 1.1:

Vision Capabilities: Enables the model to analyze images and provide insights based on visual content, in addition to text.
Improved Performance: Devstral Small 2 is a step-up compared to its predecessors.
Attention Softmax Temperature: Devstral Small 2 uses the same architecture as Ministral 3 using rope-scaling as introduced by Llama 4 and Scalable-Softmax Is Superior for Attention.
Better Generalization: Generalises better to diverse prompts and coding environments.

Use Cases

AI Code Assistants, Agentic Coding, and Software Engineering Tasks. Leveraging advanced AI capabilities for complex tool integration and deep codebase understanding in coding environments.

Benchmark Results

Model/Benchmark	Size (B Parameters)	SWE Bench Verified	SWE Bench Multilingual	Terminal Bench 2
Devstral 2	123	72.2%	61.3%	32.6%
Devstral Small 2	24	68.0%	55.7%	22.5%
GLM 4.6	355	68.0%	--	24.6%
Qwen 3 Coder Plus	480	69.6%	54.7%	25.4%
MiniMax M2	230	69.4%	56.5%	30.0%
Kimi K2 Thinking	1000	71.3%	61.1%	35.7%
DeepSeek v3.2	671	73.1%	70.2%	46.4%
GPT 5.1 Codex High	--	73.7%	--	52.8%
GPT 5.1 Codex Max	--	77.9%	--	60.4%
Gemini 3 Pro	--	76.2%	--	54.2%
Claude Sonnet 4.5	--	77.2%	68.0%	42.8%

*Benchmark results presented are based on publicly reported values for competitor models.

Usage

Scaffolding

Together with Devstral 2, we are releasing Mistral Vibe, a CLI tool allowing developers to leverage Devstral capabilities directly in your terminal.

Mistral Vibe (recommended): Learn how to use it here

Devstral 2 can also be used with the following scaffoldings:

You can use Devstral 2 either through our API or by running locally.

Mistral Vibe

The Mistral Vibe CLI is a command-line tool designed to help developers leverage Devstral’s capabilities directly from their terminal.

We recommend installing Mistral Vibe using uv for faster and more reliable dependency management:

uv tool install mistral-vibe

You can also run:

curl -LsSf https://mistral.ai/vibe/install.sh | sh

If you prefer using pip, use:

pip install mistral-vibe

To launch the CLI, navigate to your project's root directory and simply execute:

vibe

If this is your first time running Vibe, it will:

Create a default configuration file at ~/.vibe/config.toml.
Prompt you to enter your API key if it's not already configured, follow these instructions to create an Account and get an API key.
Save your API key to ~/.vibe/.env for future use.

Local Deployment

The model can also be deployed with the following libraries:

We're thankful to the llama.cpp team and their community as well as the LM Studio and Ollama teams that worked hard to make these models also available for their frameworks.

You can now also run Devstral using these (alphabetical ordered) frameworks:

llama.cpp: To use community ones such as Unsloth's or Bartowski's make sure to use changes from this PR.
LM Studio: https://lmstudio.ai/models/devstral-2
Ollama: https://ollama.com/library/devstral-small-2

If you notice subpar performance with local serving, please submit issues to the relevant framework so that it can be fixed and in the meantime we advise to use the Mistral AI API.

vLLM (recommended)

SGLang

Transformers

Tests

To help test our model via vLLM or test that other frameworks' implementations are correct, here is a set of prompts you can try with the expected outputs.

Call one tool

Call tools one at a time subsequently

Long context

Chatting tech

Small talk

Run the examples above with the following python script which assumes there is an OpenAI compatible server deployed at localhost:8000:

License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.

Downloads last month: 104

Safetensors

Model size

24B params

Tensor type

BF16

F8_E4M3

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for RedHatAI/Devstral-Small-2-24B-Instruct-2512

Base model

mistralai/Mistral-Small-3.1-24B-Base-2503

Quantized

(31)

this model

Collection including RedHatAI/Devstral-Small-2-24B-Instruct-2512

March 2026 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio. • 5 items • Updated Apr 30 • 4

Paper for RedHatAI/Devstral-Small-2-24B-Instruct-2512

Paper • 2501.19399 • Published Jan 31, 2025 • 25

URL: https://huggingface.co/RedHatAI/Devstral-Small-2-24B-Instruct-2512

⇱ RedHatAI/Devstral-Small-2-24B-Instruct-2512 · Hugging Face