URL: https://huggingface.co/mistralai/Ministral-3-3B-Base-2512

Ministral 3 3B Base 2512

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

This model is the base pre-trained version, not fine-tuned for instruction or reasoning tasks, making it ideal for custom post-training processes.
For instruction- and chat-based use cases, we recommend using Ministral 3 3B Instruct 2512.

The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 3B can even be deployed locally, fitting in 16GB of VRAM in BF16, and less than 8GB of RAM/VRAM when quantized.

Learn more in our blog post and paper.

Key Features

Ministral 3 3B consists of two main architectural components:

  • 3.4B Language Model
  • 0.4B Vision Encoder
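The memory figures quoted earlier can be sanity-checked from these two parameter counts. The sketch below is a weight-only estimate: it assumes 2 bytes per parameter for BF16 and 1 byte for FP8, and the 4-bit row is a hypothetical quantization level chosen for illustration. The KV cache, activations, and framework overhead come on top, which is presumably why the BF16 recommendation is 16GB rather than the raw weight size.

```python
# Rough weight-only memory estimate; KV cache and activations are NOT included.
LANGUAGE_MODEL_PARAMS = 3.4e9  # from the component list above
VISION_ENCODER_PARAMS = 0.4e9  # from the component list above


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the size of the weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


total_params = LANGUAGE_MODEL_PARAMS + VISION_ENCODER_PARAMS

# Bytes per parameter for each format; the 4-bit entry is a hypothetical example.
formats = [("BF16", 2.0), ("FP8", 1.0), ("4-bit (hypothetical)", 0.5)]

for name, bytes_per_param in formats:
    print(f"{name}: {weight_memory_gb(total_params, bytes_per_param):.1f} GB")
```

Under these assumptions the weights alone take about 7.6 GB in BF16, leaving headroom for the KV cache within the 16GB figure.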

The Ministral 3 3B Base model offers the following capabilities:

  • Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
  • Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
  • Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
  • Large Context Window: Supports a 256k context window.

Use Cases

Ideal for lightweight, real-time applications on edge or low-resource devices, such as:

  • Image captioning
  • Text classification
  • Real-time efficient translation
  • Data extraction
  • Short content generation
  • Fine-tuning and specialization
  • And more...

Ministral 3 3B brings advanced AI capabilities to edge and distributed environments, including embedded systems.

Ministral 3 Family

Model Name | Type | Precision | Link
Ministral 3 3B Base 2512 | Base pre-trained | BF16 | Hugging Face
Ministral 3 3B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face
Ministral 3 3B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face
Ministral 3 8B Base 2512 | Base pre-trained | BF16 | Hugging Face
Ministral 3 8B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face
Ministral 3 8B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face
Ministral 3 14B Base 2512 | Base pre-trained | BF16 | Hugging Face
Ministral 3 14B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face
Ministral 3 14B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face

Other formats available here.

Benchmark Results

We compare Ministral 3 to similarly sized models.

Reasoning

Model AIME25 AIME24 GPQA Diamond LiveCodeBench
Ministral 3 14B
Qwen3-14B (Thinking) 0.737 0.837 0.663 0.593
Ministral 3 8B 0.787 0.668
Qwen3-VL-8B-Thinking 0.580
Ministral 3 3B 0.534
Qwen3-VL-4B-Thinking 0.697 0.729 0.513

Instruct

Model Arena Hard WildBench MATH Maj@1 MM MTBench
Ministral 3 14B
Qwen3 14B (Non-Thinking) 0.427 65.1 0.870 NOT MULTIMODAL
Gemma3-12B-Instruct 0.436 63.2 0.854 6.70
Ministral 3 8B 0.509 0.876
Qwen3-VL-8B-Instruct 66.3 8.00
Ministral 3 3B 0.305 0.830 7.83
Qwen3-VL-4B-Instruct
Qwen3-VL-2B-Instruct 0.163 42.2 0.786 6.36
Gemma3-4B-Instruct 0.318 49.1 0.759 5.23

Base

Model Multilingual MMLU MATH CoT 2-Shot AGIEval 5-shot MMLU Redux 5-shot MMLU 5-shot TriviaQA 5-shot
Ministral 3 14B 0.742 0.648 0.820 0.794 0.749
Qwen3 14B Base 0.620 0.703
Gemma 3 12B Base 0.690 0.487 0.587 0.766 0.745
Ministral 3 8B 0.591 0.793
Qwen 3 8B Base 0.700 0.576 0.760 0.639
Ministral 3 3B 0.652 0.511 0.735 0.707 0.592
Qwen 3 4B Base 0.405 0.530
Gemma 3 4B Base 0.516 0.294 0.430 0.626 0.589

Usage

The model can be used with the following frameworks:

vLLM

We recommend using this model with vLLM.

Installation

Make sure to install vllm >= 0.12.0:

pip install vllm --upgrade

Doing so should automatically install mistral_common >= 1.8.6.

To check:

python -c "import mistral_common; print(mistral_common.__version__)"
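If you would rather have a script fail early than read the version by eye, a check along the following lines may help. It is a minimal sketch using only the standard library: `release_tuple`, `meets_minimum`, and `check_package` are hypothetical helper names, and the parser deliberately ignores pre-release suffixes such as `rc0`, which is a simplification rather than full version semantics.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string: str) -> tuple:
    """Parse the leading numeric components, e.g. '1.8.6' -> (1, 8, 6).

    Suffixes such as 'rc0' or '.dev1' are ignored (a deliberate simplification).
    """
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed release is at least the minimum release."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_package(name: str, minimum: str) -> bool:
    """Return True if the package is installed and meets the minimum version."""
    try:
        installed = version(name)
    except PackageNotFoundError:
        return False
    return meets_minimum(installed, minimum)
```

For example, `check_package("mistral_common", "1.8.6")` returns `False` when the package is missing or older than the minimum stated above.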

You can also use a ready-to-go Docker image, which is also available on Docker Hub.

Serve

Because of their size and the BF16 format of their weights, Ministral-3-3B-Base-2512 and Ministral-3-8B-Base-2512 can each run on a single H200 GPU.

A simple launch command is:

vllm serve mistralai/Ministral-3-3B-Base-2512 \
 --tokenizer_mode mistral --config_format mistral --load_format mistral

Additional flags:

  • You can set --max-model-len to save memory. By default it is set to 262144, which is quite large and not necessary for most scenarios.
  • You can set --max-num-batched-tokens to balance throughput and latency: a higher value means higher throughput but also higher latency.
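When the server is started from a script, the launch command and the two optional flags above can be assembled programmatically. The sketch below only builds and prints the command. `build_serve_command` is a hypothetical helper, and the values 32768 and 8192 are illustrative assumptions, not recommended settings.

```python
import shlex


def build_serve_command(model, max_model_len=None, max_num_batched_tokens=None):
    """Build the `vllm serve` argument list, adding optional flags only if set."""
    command = [
        "vllm", "serve", model,
        "--tokenizer_mode", "mistral",
        "--config_format", "mistral",
        "--load_format", "mistral",
    ]
    if max_model_len is not None:
        # Shorter context than the 262144 default, to reduce memory use.
        command += ["--max-model-len", str(max_model_len)]
    if max_num_batched_tokens is not None:
        # Higher values favor throughput, lower values favor latency.
        command += ["--max-num-batched-tokens", str(max_num_batched_tokens)]
    return command


command = build_serve_command(
    "mistralai/Ministral-3-3B-Base-2512",
    max_model_len=32768,           # illustrative value
    max_num_batched_tokens=8192,   # illustrative value
)
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would start the server, assuming `vllm` is on the PATH.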

Usage of the model

Here we assume that the model mistralai/Ministral-3-3B-Base-2512 is being served and is reachable at localhost on port 8000, which is the default for vLLM.
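Since this is a base model, plain text completion is the natural way to query it. The following is a minimal sketch that uses only the Python standard library and targets vLLM's OpenAI-compatible `/v1/completions` endpoint. The helper names, the prompt, and the sampling values (`max_tokens=64`, `temperature=0.7`) are illustrative assumptions.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # vLLM default host and port, as assumed above
MODEL = "mistralai/Ministral-3-3B-Base-2512"


def build_completion_request(prompt, max_tokens=64, temperature=0.7):
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the prompt to the running server and return the generated text."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example call (requires the server to be running):
# print(complete("The capital of France is"))
```

The request builder is kept separate from the network call so that the payload can be inspected or logged without a running server.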

Transformers

You can also use Ministral 3 3B Base 2512 with Transformers! Make sure to install Transformers from its first v5 release candidate or from "main":

pip install transformers==5.0.0rc0

To get the most out of the model with Transformers, make sure mistral-common >= 1.8.6 is installed so that our tokenizer can be used.

pip install mistral-common --upgrade

Then load our tokenizer along with the model and generate:
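A minimal sketch of that flow is given below. It assumes the Transformers v5 class names `Mistral3ForConditionalGeneration` and `MistralCommonBackend`, that `accelerate` is installed so that `device_map="auto"` works, and an illustrative `max_new_tokens=30`. Treat these as assumptions to verify against the installed release, not as the definitive loading recipe. `continuation_ids` and `generate_continuation` are hypothetical helper names.

```python
def continuation_ids(output_ids, prompt_length):
    """Drop the prompt tokens so that only newly generated ids remain."""
    return output_ids[prompt_length:]


def generate_continuation(
    prompt,
    model_id="mistralai/Ministral-3-3B-Base-2512",
    max_new_tokens=30,  # illustrative value
):
    """Load tokenizer and model, then return the text generated after the prompt."""
    # Imported lazily: requires transformers v5 and mistral-common.
    # Class names are an assumption based on the Transformers v5 API.
    from transformers import Mistral3ForConditionalGeneration, MistralCommonBackend

    tokenizer = MistralCommonBackend.from_pretrained(model_id)
    model = Mistral3ForConditionalGeneration.from_pretrained(
        model_id, device_map="auto"
    )

    # Tokenize the raw prompt; a base model takes plain text, not a chat template.
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)[0]

    # Decode only the tokens produced after the prompt.
    return tokenizer.decode(continuation_ids(output, input_ids.shape[1]))


# Example call (downloads the weights on first use):
# print(generate_continuation("Once upon a time, more than a thousand years ago,"))
```

Slicing off the prompt before decoding keeps the printed result to the model's continuation only.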

License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.
