Reasoning-focused models for general reasoning and agentic tasks. • 2 items • Updated • 2
A newer version of this model is available: OrionLLM/GRM-2.6-Plus
GRM-1.5b is a general-purpose reasoning-focused 1.5B model fine-tuned to improve multi-domain reasoning (math, logic, coding, and broad problem-solving). It is designed to be a strong, lightweight “daily driver” for general reasoning tasks and as a solid base for further fine-tuning.
Key features
- Dedicated reasoning behavior for general tasks (stepwise problem solving, better consistency).
- Small & efficient (1.5B) — practical for local inference and experimentation.
- Multi-domain mixture: reasoning + code + math + (some) medical reasoning data.
- Fine-tune friendly: intended as a good starting point for your own SFT/GRPO/DPO pipelines.
Benchmarks
| Model | AIME24 | AIME25 | AMC23 | MATH500 | HMMT O2/25 | LCB 06/24-01/25 | CodeElo | CodeForces | GPQA-D | JEEBench |
|---|---|---|---|---|---|---|---|---|---|---|
| GRM-1.5b | 52.0 | 41.7 | 87.0 | 86.4 | 27.3 | 39.4 | 12.9 | 15.5 | 29.5 | 51.9 |
| DeepSeek-R1-Distill-Qwen-1.5B | 32.3 | 23.7 | 71.8 | 80.8 | 15.3 | 27.2 | 8.8 | 8.5 | 31.1 | 32.5 |
| Nemotron-Research-Reasoning-Qwen-1.5B | 47.7 | 32.0 | 87.5 | 86.0 | 21.7 | 31.4 | 54.7 | 40.3 | 41.8 | 52.6 |
| Qwen3-1.7B | 52.0 | 35.3 | 83.8 | 87.2 | 23.3 | 27.7 | 20.7 | 20.0 | 49.3 | 60.7 |
| Qwen2.5-1.5B-Instruct | 3.0 | 0.7 | 30.8 | 50.2 | 0.0 | 5.5 | 0.8 | 2.2 | 24.7 | 16.4 |
- Downloads last month
- 91
Safetensors
Model size
2B params
Tensor type
BF16
·
Model tree for OrionLLM/GRM-1.5b
Collection including OrionLLM/GRM-1.5b
Evaluation results
- Idavidrein/gpqa · Diamond View evaluation results leaderboard 29.5
