Qwen 3.6 35B-A3B Anko

A Doubao Seed 2.0 distillation on top of Qwen 3.6 35B-A3B, intended to increase the quality of the reasoning, decrease looping, and improve generalization.

👁 image

Recommended Settings

DO NOT USE QWEN'S SAMPLERS. THEY ARE AWFUL.

This one tested with temperature of 1.1 and top_p of 0.95, but YMMV and you may find better results with other samplers.

For assistant tasks, it was trained to use a Claude system prompt:

You are Claude, a helpful and harmless language model created by Anthropic.

and we recommend using this prompt to achieve best capabilities.

Training Process

This model is a basic r=64,a=512* LoRA on reasoning traces and responses (as well as non-thinking responses) generated primarily by Doubao Seed 2.0 Pro, as well as Doubao Seed 2.0 Mini for some synthetic story tasks, as during data generation it refused erotic tasks a lot less often and creative output was mostly on par.

* This is equivalent to a r=64,a=64 rsLoRA, but some frameworks do not properly implement rsLoRA support.

Downloads last month: 165

Safetensors

Model size

35B params

Tensor type

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for allura-org/Qwen3.6-35B-A3B-Anko

Base model

Qwen/Qwen3.6-35B-A3B

Finetuned

(140)

this model

Quantizations

3 models

Collection including allura-org/Qwen3.6-35B-A3B-Anko

A series of generalized Doubao Seed 2.0 distillations • 2 items • Updated Apr 23 • 1

URL: https://huggingface.co/allura-org/Qwen3.6-35B-A3B-Anko

⇱ allura-org/Qwen3.6-35B-A3B-Anko · Hugging Face

Qwen 3.6 35B-A3B Anko

Recommended Settings

Training Process

Model tree for allura-org/Qwen3.6-35B-A3B-Anko

Collection including allura-org/Qwen3.6-35B-A3B-Anko