Ornith-1.0-35B-MTP-APEX
English | π δΈζζζ‘£
Self-improving agentic coding model Β· APEX quantized GGUFs + BF16 + mmproj
Ornith-1.0-35B is a self-improving agentic coding model from DeepReinforce AI, post-trained on top of Qwen3.5 with RL to jointly optimize scaffold generation and solution rollouts.
It achieves state-of-the-art performance among open-source models of comparable size on Terminal-Bench 2.1, SWE-Bench Verified/Pro/Multilingual, NL2Repo, and OpenClaw.
This GGUF package includes the mmproj-F16.gguf vision projector for multimodal (image + text) capabilities with llama.cpp. MTP layers are sourced from Qwen3.5-35B-A3B (same architecture, compatible weights). License: MIT.
| Architecture | Qwen3.5 MoE (Mixture of Experts) |
| Parameters | 35B total, 3B active per token |
| Experts | 256 routed experts, 8 active per token |
| Layers | 40 transformer layers + 1 MTP layer |
| Context | 262,144 tokens |
| MTP | 1 MTP layer (785 tensors) from Qwen3.5-35B-A3B |
| License | MIT |
| Mode | ToolCall-15 | BugFind-15 | HermesAgent-20 | Max | Eff. |
|---|---|---|---|---|---|
| Thinking | 100 | 93 | 89 | 93.5 | 75.5 |
| No Thinking | 100 | 92 | 89 | 93.2 | 85.2 |
RTX 5070 Ti Β· No-thinking mode achieves better practical reliability (fewer retries).
llama.cpp (text only)
hf download SC117/Ornith-1.0-35B-MTP-APEX-GGUF --include "*.gguf" --local-dir ./models ./llama-server -m ./models/Ornith-1.0-35B-MTP-APEX-I-Compact.gguf -ngl 99 -c 131072
llama.cpp (vision + text)
./llama-server -m ./models/Ornith-1.0-35B-MTP-APEX-I-Compact.gguf --mmproj ./models/mmproj-F16.gguf -ngl 99 -c 131072
| Mode | Parameters |
|---|---|
| General | temperature=0.6, top_p=0.95, top_k=20 |
| Coding | temperature=0.6, top_p=0.95, top_k=20 |
These GGUF files are quantized using APEX, an MoE-aware mixed-precision quantization technique. APEX classifies every tensor by its role β routed expert, shared expert, or attention β and applies a layer-wise precision gradient, giving sensitive edge layers higher precision and compressing redundant middle layers more aggressively.
APEX beats Q8_0 perplexity at half the size β and even beats F16.
| File | Size | Profile | Best For |
|---|---|---|---|
*-APEX-I-Quality.gguf | 21.90 GB | I-Quality | Highest quality, best accuracy |
*-APEX-I-Balanced.gguf | 24.18 GB | I-Balanced | Best all-rounder, recommended |
*-APEX-I-Compact.gguf | 15.85 GB | I-Compact | Best quality/size ratio |
Links
- Original Model: https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B
- Ornith Blog: https://deep-reinforce.com/ornith.html
- APEX Quantization: https://github.com/mudler/apex-quant
- BenchLocal Results: https://scorp1o117.github.io/benchlocal-results/
Citation
@misc{ornith-35b,
title = {{Ornith-1.0-35B}: Agentic Coding, Open to All},
url = {https://deep-reinforce.com/ornith_1_0.html},
author = {{DeepReinforce Team}},
year = {2026}
}
- Downloads last month
- 6,337
16-bit
Model tree for SC117/Ornith-1.0-35B-MTP-APEX-GGUF
Base model
deepreinforce-ai/Ornith-1.0-35B