URL: https://huggingface.co/tencent/Youtu-LLM-2B-Base

tencent/Youtu-LLM-2B-Base · Hugging Face


🎯 Brief Introduction

Youtu-LLM is a new, small yet powerful LLM: it contains only 1.96B parameters, supports a 128k-token context, and has native agentic capabilities. On general evaluations, Youtu-LLM significantly outperforms SOTA LLMs of similar size on commonsense, STEM, coding, and long-context tasks; on agent-related benchmarks, it surpasses larger leading models and can complete multiple end-to-end agent tasks.

Youtu-LLM has the following features:

  • Type: Autoregressive causal language model with dense MLA (Multi-head Latent Attention)
  • Release versions: Base and Instruct
  • Number of Parameters: 1.96B
  • Number of Layers: 32
  • Number of Attention Heads (MLA): 16 for Q/K/V
  • MLA Rank: 1,536 for Q, 512 for K/V
  • MLA Dim: 128 for QK Nope, 64 for QK Rope, and 128 for V
  • Context Length: 131,072
  • Vocabulary Size: 128,256
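As a rough illustration of what the MLA dimensions above imply, the following sketch estimates the KV-cache footprint at the full 131,072-token context. It assumes DeepSeek-style MLA caching (one compressed K/V latent plus one shared RoPE key per layer) and a BF16 cache; neither detail is stated on this page, so treat the numbers as back-of-the-envelope arithmetic rather than measured memory use. The function names are hypothetical.

```python
# Architecture values copied from the feature list above.
NUM_LAYERS = 32
NUM_HEADS = 16
KV_RANK = 512            # MLA rank for K/V
QK_NOPE_DIM = 128        # per-head non-rotary Q/K dimension
QK_ROPE_DIM = 64         # rotary Q/K dimension
V_DIM = 128              # per-head V dimension
CONTEXT_LENGTH = 131_072
BYTES_PER_VALUE = 2      # ASSUMPTION: cache stored in BF16


def mla_cache_values_per_token() -> int:
    """Cached values per token if only the K/V latent and RoPE key are kept.

    ASSUMPTION: DeepSeek-style MLA caching; not confirmed for Youtu-LLM.
    """
    return NUM_LAYERS * (KV_RANK + QK_ROPE_DIM)


def full_cache_values_per_token() -> int:
    """Cached values per token if full per-head K and V were stored instead."""
    return NUM_LAYERS * NUM_HEADS * (QK_NOPE_DIM + QK_ROPE_DIM + V_DIM)


def cache_gib(values_per_token: int, tokens: int = CONTEXT_LENGTH) -> float:
    """Convert a per-token value count into GiB for a given sequence length."""
    return values_per_token * BYTES_PER_VALUE * tokens / 2**30


mla_gib = cache_gib(mla_cache_values_per_token())
full_gib = cache_gib(full_cache_values_per_token())
print(f"latent cache: {mla_gib} GiB, full K/V cache: {full_gib} GiB")
```

Under these assumptions the latent cache needs 4.5 GiB for one full-length sequence, against 40 GiB for uncompressed per-head keys and values.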

🤗 Model Download

Model Name | Description | Download
Youtu-LLM-2B-Base | Base model of Youtu-LLM-2B | 🤗 Model
Youtu-LLM-2B | Instruct model of Youtu-LLM-2B | 🤗 Model
Youtu-LLM-2B-GGUF | Instruct model of Youtu-LLM-2B, in GGUF format | 🤗 Model

📰 News

  • [2026.01.28] You can now use Youtu-LLM directly with Transformers>=5.1.0.
  • [2026.01.07] You can now fine-tune Youtu-LLM with ModelScope.
  • [2026.01.04] You can now fine-tune Youtu-LLM with LlamaFactory.

    Note: If you wish to use Youtu-LLM-2B-Base with an earlier version of Transformers (>=4.56.0,<=4.57.1), make sure to download the model repository as it was before this commit.
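The version constraints above can be checked before loading the model. The sketch below is a minimal, hedged example: `parse_version`, `supported_setup`, and `load_base_model` are hypothetical helper names, while `AutoTokenizer.from_pretrained` and `AutoModelForCausalLM.from_pretrained` are the standard Transformers loading calls. It assumes the default loading arguments are sufficient with Transformers>=5.1.0.

```python
import re

REPO_ID = "tencent/Youtu-LLM-2B-Base"


def parse_version(version: str) -> tuple:
    """Return (major, minor, patch) from strings such as '5.1.0' or '4.57.0rc1'."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def supported_setup(version: str) -> str:
    """Classify a Transformers version against the ranges given in the news items."""
    v = parse_version(version)
    if v >= (5, 1, 0):
        return "current"          # works with the current model repository
    if (4, 56, 0) <= v <= (4, 57, 1):
        return "earlier-commit"   # needs the repository from before the noted commit
    return "unsupported"


def load_base_model(repo_id: str = REPO_ID):
    """Load tokenizer and model; downloads weights on first use."""
    # Imported lazily so the version helpers work without Transformers installed.
    import transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    setup = supported_setup(transformers.__version__)
    if setup != "current":
        raise RuntimeError(
            f"transformers {transformers.__version__} is '{setup}'; "
            "upgrade to >=5.1.0 or use an earlier revision of the repository"
        )
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `load_base_model()` returns a `(tokenizer, model)` pair; because this is a base model, prompts are best written as plain text to be continued rather than as chat messages.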

📊 Performance Comparisons

Base Model

๐Ÿ‘ Comparison between Youtu-LLM-2B-Base and baselines

General Benchmarks

Type Benchmark (Metric) # Shots Qwen3-1.7B-Base SmoLM3-3B-Base Gemma3-4B-Base Qwen3-4B-Base Llama3.1-8B Youtu-LLM-2B-Base
Commonsense MMLU-Pro (EM) 5 34.9% 35.3% 29.4% 36.2% 48.4%
MLQA-Zh (EM) 3 38.1% 38.0% 40.3% 47.2% 43.0%
MMLU-ProX-Zh (EM) 5 32.5% 26.7% 24.2% 45.2% 25.4%
STEM GSM8K (EM) 8 68.2% 67.3% 38.5% 80.8% 47.8%
MGSM-Zh (EM) 8 57.1% 40.7% 33.0% 69.7% 35.9%
MATH (EM) 4 28.1% 40.8% 24.4% 44.8% 21.5%
BBH (EM) 3 53.0% 59.8% 51.6% 70.8% 59.8%
GPQA-MC (Acc. Norm) 5 30.4% 26.6% 28.6% 37.8% 30.1%
HLE-MC (Acc. Norm) 3 10.7% 3.1% 8.0% 11.5% 17.4%
Coding MBPP (Pass@1) 3 55.6% 51.0% 45.8% 67.5% 49.4%
MBPP+ (Pass@1) 3 71.0% 66.1% 61.9% 62.7% 81.8%
HumanEval (Pass@1) 0 49.9% 34.8% 36.6% 36.0% 64.6%
HumanEval+ (Pass@1) 0 41.3% 28.1% 28.1% 28.1% 57.3%
LiveCodeBench v6 (Pass@1) 3 5.1% 2.9% 2.9% 3.4% 9.7%
CRUXEval (Pass@1) 1 40.6% 42.1% 39.7% 42.3% 55.9%
RepoBench (EM) 3 21.0% 21.8% 23.0% 25.3% 22.7%
Long Context LongBench v2 (Acc.) 3 28.8% 26.6% 25.8% 27.8% 27.2%
NIAH (Acc.) / 79.8% 75.0% 83.0% 99.8% 98.8%

Agentic Benchmarks

We use APTBench to evaluate the agentic capabilities of the base model.

Category Qwen3-1.7B-Base SmoLM3-3B-Base Gemma3-4B-Base Qwen3-4B-Base Llama3.1-8B Youtu-LLM-2B-Base
Code 25.1% 24.3% 32.8% 41.9% 23.6%
Deep Research 28.5% 27.2% 36.4% 40.5% 30.0%
Math 59.9% 60.7% 59.8% 70.5% 60.1%
Tool 56.7% 59.1% 61.7% 65.8% 64.1%

📚 Citation

If you find our work useful in your research, please consider citing the following paper:

@article{youtu-llm,
 title={Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models},
 author={Tencent Youtu Lab},
 year={2025},
 eprint={2512.24618},
 archivePrefix={arXiv},
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2512.24618}, 
}
Safetensors · Model size: 2B params · Tensor type: BF16
