ICLR2026: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners https://arxiv.org/abs/2509.26226
README.md exists but content is empty.
- Downloads last month: 12
- Format: Safetensors
- Model size: 4B params
- Tensor type: BF16
