APEX MTP Vision MIT

Ornith-1.0-35B-MTP-APEX

Self-improving agentic coding model · APEX quantized GGUFs + BF16 + mmproj

🐦 About Ornith

Ornith-1.0-35B is a self-improving agentic coding model from DeepReinforce AI, post-trained on top of Qwen3.5 with RL to jointly optimize scaffold generation and solution rollouts.

It achieves state-of-the-art performance among open-source models of comparable size on Terminal-Bench 2.1, SWE-Bench Verified/Pro/Multilingual, NL2Repo, and OpenClaw.

This GGUF package includes the mmproj-F16.gguf vision projector for multimodal (image + text) capabilities with llama.cpp. MTP layers are sourced from Qwen3.5-35B-A3B (same architecture, compatible weights). License: MIT.

🧠 Model Details

Architecture	Qwen3.5 MoE (Mixture of Experts)
Parameters	35B total, 3B active per token
Experts	256 routed experts, 8 active per token
Layers	40 transformer layers + 1 MTP layer
Context	262,144 tokens
MTP	1 MTP layer (785 tensors) from Qwen3.5-35B-A3B
License	MIT

📊 BenchLocal Results (APEX-I-Compact, 15.85 GB)

Mode	ToolCall-15	BugFind-15	HermesAgent-20	Max	Eff.
Thinking	100	93	89	93.5	75.5
No Thinking	100	92	89	93.2	85.2

RTX 5070 Ti · No-thinking mode achieves better practical reliability (fewer retries).

🚀 Usage

llama.cpp (text only)

hf download SC117/Ornith-1.0-35B-MTP-APEX-GGUF --include "*.gguf" --local-dir ./models ./llama-server -m ./models/Ornith-1.0-35B-MTP-APEX-I-Compact.gguf -ngl 99 -c 131072

llama.cpp (vision + text)

./llama-server -m ./models/Ornith-1.0-35B-MTP-APEX-I-Compact.gguf --mmproj ./models/mmproj-F16.gguf -ngl 99 -c 131072

🎛️ Recommended Settings

Mode	Parameters
General	temperature=0.6, top_p=0.95, top_k=20
Coding	temperature=0.6, top_p=0.95, top_k=20

💡 What is APEX?

These GGUF files are quantized using APEX, an MoE-aware mixed-precision quantization technique. APEX classifies every tensor by its role — routed expert, shared expert, or attention — and applies a layer-wise precision gradient, giving sensitive edge layers higher precision and compressing redundant middle layers more aggressively.

APEX beats Q8_0 perplexity at half the size — and even beats F16.

📦 APEX Quantization Tiers

File	Size	Profile	Best For
`*-APEX-I-Quality.gguf`	21.90 GB	I-Quality	Highest quality, best accuracy
`*-APEX-I-Balanced.gguf`	24.18 GB	I-Balanced	Best all-rounder, recommended
`*-APEX-I-Compact.gguf`	15.85 GB	I-Compact	Best quality/size ratio

Citation

@misc{ornith-35b,
 title = {{Ornith-1.0-35B}: Agentic Coding, Open to All},
 url = {https://deep-reinforce.com/ornith_1_0.html},
 author = {{DeepReinforce Team}},
 year = {2026}
}

Downloads last month: 6,337

GGUF

Model size

0.4B params

Architecture

clip

Hardware compatibility

16-bit

View +3 variants

Model tree for SC117/Ornith-1.0-35B-MTP-APEX-GGUF

Base model

deepreinforce-ai/Ornith-1.0-35B

Quantized

(87)

this model

Collection including SC117/Ornith-1.0-35B-MTP-APEX-GGUF

a self-improving family of open-source models for agentic coding. • 3 items • Updated 1 day ago • 2

URL: https://huggingface.co/SC117/Ornith-1.0-35B-MTP-APEX-GGUF

⇱ SC117/Ornith-1.0-35B-MTP-APEX-GGUF · Hugging Face

Ornith-1.0-35B-MTP-APEX

Links

Citation

Model tree for SC117/Ornith-1.0-35B-MTP-APEX-GGUF

Collection including SC117/Ornith-1.0-35B-MTP-APEX-GGUF