VOOZH about

URL: https://huggingface.co/unsloth/Ministral-3-14B-Reasoning-2512-GGUF

⇱ unsloth/Ministral-3-14B-Reasoning-2512-GGUF Β· Hugging Face


See our Ministral 3 collection for all versions including GGUF, 4-bit & FP8 formats.

Learn to run Ministral correctly - Read our Guide.

See Unsloth Dynamic 2.0 GGUFs for our quantization benchmarks.

✨ Read our Ministral 3 Guide here!


Ministral 3 14B Reasoning 2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language model with vision capabilities.

This model is the reasoning post-trained version, trained for reasoning tasks, making it ideal for math, coding and stem related use cases.

The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 14B can even be deployed locally, capable of fitting in 32GB of VRAM in BF16, and less than 24GB of RAM/VRAM when quantized.

Key Features

Ministral 3 14B consists of two main architectural components:

  • 13.5B Language Model
  • 0.4B Vision Encoder

The Ministral 3 14B Reasoning model offers the following capabilities:

  • Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
  • System Prompt: Maintains strong adherence and support for system prompts.
  • Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
  • Reasoning: Excels at complex, multi-step reasoning and dynamic problem-solving.
  • Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
  • Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
  • Large Context Window: Supports a 256k context window.

Use Cases

Private AI deployments where advanced capabilities meet practical hardware constraints:

  • Private/custom chat and AI assistant deployments in constrained environments
  • Advanced local agentic use cases
  • Fine-tuning and specialization
  • And more...

Bringing advanced AI capabilities to most environments.

Ministral 3 Family

Model Name Type Precision Link
Ministral 3 3B Base 2512 Base pre-trained BF16 Hugging Face
Ministral 3 3B Instruct 2512 Instruct post-trained FP8 Hugging Face
Ministral 3 3B Reasoning 2512 Reasoning capable BF16 Hugging Face
Ministral 3 8B Base 2512 Base pre-trained BF16 Hugging Face
Ministral 3 8B Instruct 2512 Instruct post-trained FP8 Hugging Face
Ministral 3 8B Reasoning 2512 Reasoning capable BF16 Hugging Face
Ministral 3 14B Base 2512 Base pre-trained** BF16 Hugging Face
Ministral 3 14B Instruct 2512 Instruct post-trained FP8 Hugging Face
Ministral 3 14B Reasoning 2512 Reasoning capable BF16 Hugging Face

Other formats available here.

Benchmark Results

We compare Ministral 3 to similar sized models.

Reasoning

Model AIME25 AIME24 GPQA Diamond LiveCodeBench
Ministral 3 14B
Qwen3-14B (Thinking) 0.737 0.837 0.663 0.593
Ministral 3 8B 0.787 0.668
Qwen3-VL-8B-Thinking 0.580
Ministral 3 3B 0.534
Qwen3-VL-4B-Thinking 0.697 0.729 0.513

Instruct

Model Arena Hard WildBench MATH Maj@1 MM MTBench
Ministral 3 14B
Qwen3 14B (Non-Thinking) 0.427 65.1 0.870 NOT MULTIMODAL
Gemma3-12B-Instruct 0.436 63.2 0.854 6.70
Ministral 3 8B 0.509 0.876
Qwen3-VL-8B-Instruct 66.3 8.00
Ministral 3 3B 0.305 0.830 7.83
Qwen3-VL-4B-Instruct
Qwen3-VL-2B-Instruct 0.163 42.2 0.786 6.36
Gemma3-4B-Instruct 0.318 49.1 0.759 5.23

Base

Model Multilingual MMLU MATH CoT 2-Shot AGIEval 5-shot MMLU Redux 5-shot MMLU 5-shot TriviaQA 5-shot
Ministral 3 14B 0.742 0.648 0.820 0.794 0.749
Qwen3 14B Base 0.620 0.703
Gemma 3 12B Base 0.690 0.487 0.587 0.766 0.745
Ministral 3 8B 0.591 0.793
Qwen 3 8B Base 0.700 0.576 0.760 0.639
Ministral 3 3B 0.652 0.511 0.735 0.707 0.592
Qwen 3 4B Base 0.405 0.530
Gemma 3 4B Base 0.516 0.294 0.430 0.626 0.589

Usage

The model can be used with the following frameworks;

vLLM

We recommend using this model with vLLM.

Installation

Make sure to install vLLM >= 0.12.0:

pip install vllm --upgrade

Doing so should automatically install mistral_common >= 1.8.6.

To check:

python -c "import mistral_common; print(mistral_common.__version__)"

You can also make use of a ready-to-go docker image or on the docker hub.

Serve

To fully exploit the Ministral-3-14B-Reasoning-2512 we recommed using 2xH200 GPUs for deployment due to its large context. However if you don't need a large context, you can fall back to a single GPU.

A simple launch command is:


vllm serve mistralai/Ministral-3-14B-Reasoning-2512-FP8 \
 --tensor-parallel-size 2 \
 --enable-auto-tool-choice --tool-call-parser mistral \
 --reasoning-parser mistral

Key parameter notes:

  • enable-auto-tool-choice: Required when enabling tool usage.
  • tool-call-parser mistral: Required when enabling tool usage.
  • reasoning-parser mistral: Required when enabling reasoning.

Additional flags:

  • You can set --max-model-len to preserve memory. By default it is set to 262144 which is quite large but not necessary for most scenarios.
  • You can set --max-num-batched-tokens to balance throughput and latency, higher means higher throughput but higher latency.

Usage of the model

Here we asumme that the model mistralai/Ministral-3-8B-Reasoning-2512 is served and you can ping it to the domain localhost with the port 8000 which is the default for vLLM.

Transformers

You can also use Ministral 3 3B Reasoning 2512 with Transformers ! Make sure to install Transformers from its first v5 release candidate or from "main":

pip install transformers==5.0.0rc0

To make the best use of our model with Transformers make sure to have installed mistral-common >= 1.8.6 to use our tokenizer.

pip install mistral-common --upgrade

Then load our tokenizer along with the model and generate:

License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.

Downloads last month
8,588
GGUF
Model size
14B params
Architecture
mistral3
Hardware compatibility
Log In to add your hardware

1-bit

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for unsloth/Ministral-3-14B-Reasoning-2512-GGUF

Collections including unsloth/Ministral-3-14B-Reasoning-2512-GGUF