SchGen-Qwen3.6-27B-EU

Sovereign (EU) schematic-generation model for KiCad. A QLoRA fine-tune of Qwen/Qwen3.6-27B (dense, Apache-2.0) that generates executable Python in a semantic schematic DSL to build KiCad schematics from natural-language requests, following the SchGen method (Luo et al., 2026).

Trained, fused, quantized and evaluated entirely on-premise by ailiance / L'Électron Rare, as an Apache-2.0 sovereign replacement for the non-sovereign gpt-oss-20b + microsoft/SchGen stack. This repo is the full fused BF16 model.

It is an assistant, not an autonomous EDA tool. Generated schematics must be checked with KiCad ERC/DRC before use. Serve with enable_thinking=false (see Inference). See Evaluation for a transparent, self-critical account of what the benchmarks do and do not show.

Variants

| Repo | Format | Size | Use case |
|---|---|---|---|
| `Ailiance-fr/SchGen-Qwen3.6-27B-EU` | safetensors BF16 (this repo) | 52 GB | Transformers / vLLM, full precision |
| `…-EU-lora` | PEFT/LoRA adapter | 152 MB | Apply on `Qwen/Qwen3.6-27B` |
| `…-EU-MLX-8bit` | MLX 8-bit | 27 GB | Apple Silicon (mlx-lm) |
| `…-EU-MLX-4bit` | MLX 4-bit | 14 GB | Apple Silicon, low-memory |
| `…-EU-GGUF` | GGUF | varies | llama.cpp / Ollama |

Intended use

Generate KiCad schematic-construction code for small-to-medium schematic modules and open-source hardware, from natural language. The model emits a small Python DSL (4 primitives) rather than raw .kicad_sch S-expressions; an executor runs that DSL to produce the schematic.
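
The split between generated DSL and executor can be illustrated with a dry run. This is a minimal sketch, not the SchGen executor: the stubs only record calls so that generated code can be inspected before anything is drawn. The primitive names are those of the DSL contract described under Training data; `RecordingExecutor`, `dry_run`, and the placeholder argument to `add_schematic_symbol` are hypothetical.

```python
from typing import Any


class RecordingExecutor:
    """Hypothetical stand-in for the real executor: records DSL calls, draws nothing."""

    PRIMITIVES = (
        "add_schematic_symbol",
        "get_pin_location",
        "add_label",
        "connect_pins",
        "write_out_all_wires",
    )

    def __init__(self) -> None:
        self.calls: list[tuple[str, tuple, dict]] = []

    def _record(self, name: str):
        # Each stub appends its call and returns None. The real get_pin_location
        # returns a location, so code that uses its result needs the real executor.
        def primitive(*args: Any, **kwargs: Any) -> None:
            self.calls.append((name, args, kwargs))
        return primitive

    def namespace(self) -> dict[str, Any]:
        return {name: self._record(name) for name in self.PRIMITIVES}

    def count(self, name: str) -> int:
        return sum(1 for called, _, _ in self.calls if called == name)


def dry_run(generated_code: str) -> RecordingExecutor:
    """Run generated DSL against the recording stubs.

    exec() on model output is unsafe outside a sandbox; this is for illustration only.
    """
    executor = RecordingExecutor()
    exec(generated_code, executor.namespace())
    return executor


# The connect_pins call is the example from the card; the add_schematic_symbol
# argument is a placeholder because the card elides that signature.
sample = '''
add_schematic_symbol("placeholder")
connect_pins("SDA_0", "1", "J1", "Pin_2")
write_out_all_wires()
'''

ex = dry_run(sample)
print(ex.count("add_schematic_symbol"), ex.count("connect_pins"))
```

A recorder like this is enough to reject completions that place no symbols or never call `write_out_all_wires()`, before paying for a full KiCad run.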

Inference

Serve with enable_thinking=false. With reasoning enabled the model tends to over-deliberate, producing very long, sometimes syntactically invalid completions and high latency; with thinking disabled it is stable and concise. Use temperature=0 for reproducible schematic code.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ailiance-fr/SchGen-Qwen3.6-27B-EU"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16", device_map="auto")

# System prompt: the full DSL contract (see Training data). Placeholder, fill in before running.
SCHGEN_DSL_CONTRACT = "..."

messages = [
    {"role": "system", "content": SCHGEN_DSL_CONTRACT},
    {"role": "user", "content": "A 4-pin header exposing I2C (SCL/SDA) plus 3V3 and GND."},
]
# return_dict=True yields input_ids plus attention_mask; enable_thinking is passed to the chat template.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True,
    enable_thinking=False,
).to(model.device)
# Greedy decoding (do_sample=False) is the deterministic equivalent of temperature 0.
out = model.generate(**inputs, max_new_tokens=2048, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

The repo ships the production chat_template.jinja.
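
When the model is served behind an OpenAI-compatible endpoint, the same two settings (thinking disabled, temperature 0) have to travel in the request body. The sketch below assumes a vLLM server, which accepts a `chat_template_kwargs` field and forwards it to the chat template; other servers may expose a different switch. `build_chat_request` is a hypothetical helper and the system prompt is a placeholder.

```python
import json


def build_chat_request(system_prompt: str, user_prompt: str,
                       model: str = "Ailiance-fr/SchGen-Qwen3.6-27B-EU") -> dict:
    """Build a /v1/chat/completions body with thinking disabled (vLLM assumed)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0,
        "max_tokens": 2048,
        # Assumption: the server forwards these kwargs to the chat template, as vLLM does.
        "chat_template_kwargs": {"enable_thinking": False},
    }


payload = build_chat_request(
    "<DSL contract goes here>",  # placeholder for the real system prompt
    "A 4-pin header exposing I2C (SCL/SDA) plus 3V3 and GND.",
)
print(json.dumps(payload, indent=2))
```

The body can then be POSTed to the server's `/v1/chat/completions` route with any HTTP client.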

Training method

A QLoRA fine-tune on a single NVIDIA RTX 4090 (24 GB, CUDA) — a faithful port of an MLX dense-27B recipe, but without the Apple-Silicon Metal sequence-length ceiling, so completions are not truncated at train time.

Base. Qwen/Qwen3.6-27B — a dense 27B model with a hybrid attention stack (linear/state-space linear_attn layers + standard self_attn), 64 layers. Dense (not the MoE sibling) was chosen to match the gpt-oss-20b it replaces and to avoid MoE LoRA hot-swap fragility.
QLoRA. 4-bit NF4 double-quant base, bf16 compute, paged_adamw_8bit, gradient accumulation 16. LoRA rank 32, α 1024 (scale α/r = 32), dropout 0.01, applied to the top 16 layers (48–63), targeting q/k/v/o/gate/up/down. Prompt-masked loss (loss on the completion only). Only 0.15 % of weights are trainable (39.8 M of 26.9 B params).
Sequence length. 3072 tokens (full-context), twice the 1536-token ceiling of the MLX/Mac run, so the long tail of the DSL completions is preserved rather than cut. This is the key improvement over the Mac-trained variant.
Curriculum. Three phases, adapter-chained, with a decaying LR. The microsoft/SchGen pairs are partitioned into three disjoint splits (~1200 unique pairs per phase, ~3.6 K unique in total; cross-phase overlap is ~0, so the phases are not the whole set repeated). Each phase file holds ~2366 lines (its split, internally duplicated ~2×). The phases are trained in sequence for 500 / 800 / 500 optimizer steps at LR 8e-6 → 5e-6 → 3e-6, each resuming from the previous phase's adapter. Per-phase mean training loss ≈ 0.034 / 0.0004 / 0.043; phase 3 trains on a fresh, harder split, so its mean loss is higher than phase 2's.
Fusion. The phase-3 adapter is merged into the BF16 base (CPU, bit-exact merge_and_unload) → this repo. MLX 8-bit / 4-bit (group_size 64) variants are quantized from the fused model.
Reproducibility. Seed 42. Single RTX 4090 (KXKM-AI node), CUDA. Training completed 2026-06-14.
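
The figures above can be cross-checked with a few lines of arithmetic. This is only a worked check of the numbers stated in this section; the micro-batch size is not given in the card and is assumed to be 1 here, so the "passes over the phase file" estimate is illustrative rather than reported.

```python
# Values taken from the Training method section.
LORA_RANK = 32
LORA_ALPHA = 1024
TRAINABLE_PARAMS = 39.8e6
TOTAL_PARAMS = 26.9e9
GRAD_ACCUM = 16
PHASE_STEPS = (500, 800, 500)
PHASE_FILE_LINES = 2366

# ASSUMPTION: the per-device micro-batch size is not stated in the card.
MICRO_BATCH = 1

lora_scale = LORA_ALPHA / LORA_RANK                    # alpha / r
trainable_pct = 100 * TRAINABLE_PARAMS / TOTAL_PARAMS  # share of trainable weights
adapted_layers = list(range(48, 64))                   # layers 48..63 inclusive

# Sequences consumed per optimizer step, then passes over one phase file.
sequences_per_step = GRAD_ACCUM * MICRO_BATCH
passes = [steps * sequences_per_step / PHASE_FILE_LINES for steps in PHASE_STEPS]

print(f"LoRA scale: {lora_scale}")
print(f"Trainable: {trainable_pct:.2f} %")
print(f"Adapted layers: {len(adapted_layers)}")
print(f"Total optimizer steps: {sum(PHASE_STEPS)}")
print("Passes over each phase file:", [round(p, 1) for p in passes])
```

Under the micro-batch assumption, each phase makes roughly three to five passes over its file, and since each file already duplicates its split about twice, each unique pair would be seen correspondingly more often.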

Training data

microsoft/SchGen_dataset (MIT): ~8 K NL→code pairs. Each record is a 3-message chat:

system — the full DSL contract: four primitives add_schematic_symbol(...), get_pin_location(symbol_ref, pin_name), add_label(...), connect_pins(sym_a, pin_a, sym_b, pin_b) + write_out_all_wires(), with A4-sheet placement rules.
user — a natural-language request.
assistant — executable Python using relative placement and pin-name-based wiring (e.g. connect_pins("SDA_0","1","J1","Pin_2")).
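
The record layout above, combined with the prompt-masked loss mentioned under Training method, can be sketched as follows. This is a simplified illustration with toy token ids, not the actual training code: `make_record` and `mask_prompt` are hypothetical helpers, and -100 is used because it is the default ignore index of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch cross-entropy by default


def make_record(system: str, user: str, assistant: str) -> list[dict]:
    """One training record: a 3-message chat in system / user / assistant order."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
        {"role": "assistant", "content": assistant},
    ]


def mask_prompt(prompt_ids: list[int], completion_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and completion; only completion tokens keep a label."""
    input_ids = prompt_ids + completion_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels


record = make_record(
    "<DSL contract>",  # placeholder for the full system prompt
    "A 4-pin header exposing I2C (SCL/SDA) plus 3V3 and GND.",
    'connect_pins("SDA_0","1","J1","Pin_2")',
)

# Toy ids standing in for the tokenized system+user prompt and the assistant reply.
input_ids, labels = mask_prompt([11, 12, 13], [21, 22])
print(input_ids, labels)
```

With labels built this way, the long system prompt contributes context but no gradient, so the loss reflects only the generated DSL.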

(Upstream sources are SparkFun open-hardware designs, CC-BY-SA-4.0; the released pairs are the synthesized DSL + NL, MIT.)

Validation method

Evaluated with iact-bench (EU-AI-Act audit harness), domain kicad-sch, validator kicad-pro-sch-gate — a real ERC gate — plus a separate stylistic LLM judge.

Pipeline (identical for every model; only the LLM endpoint changes): prompt → BM25 symbol pre-selection → LLM emits Python DSL → SchGen executor builds a .kicad_sch → KiCad 10.0.3 kicad-cli sch erc --severity-all → PASS iff zero violations. Deterministic: seed 42, temperature 0.
Judge (advisory): an LLM rates output 0–10 against an (empty) reference, so it measures stylistic plausibility, not electrical correctness.
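
The ERC step of the pipeline can be approximated with a thin wrapper around `kicad-cli`. This is a sketch under assumptions, not the kicad-pro-sch-gate validator: `--severity-all` comes from the pipeline description above, while `--exit-code-violations` and `--output` are assumed to be available in the installed kicad-cli (check `kicad-cli sch erc --help`). `build_erc_command` and `erc_passes` are hypothetical names.

```python
import subprocess
from pathlib import Path


def build_erc_command(schematic: Path, report: Path) -> list[str]:
    """Assemble the ERC invocation; a nonzero exit code signals violations."""
    return [
        "kicad-cli", "sch", "erc",
        "--severity-all",          # report every severity, as in the pipeline above
        "--exit-code-violations",  # assumed flag: nonzero exit code if violations exist
        "--output", str(report),
        str(schematic),
    ]


def erc_passes(schematic: Path, report: Path) -> bool:
    """PASS iff kicad-cli exits with 0. Load errors also count as failures here."""
    result = subprocess.run(
        build_erc_command(schematic, report),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


cmd = build_erc_command(Path("out.kicad_sch"), Path("out.rpt"))
print(" ".join(cmd))
```

Treating any nonzero exit as a failure is the conservative choice: a schematic that kicad-cli cannot load never counts as a pass.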

How to read benchmark numbers (self-critical)

A full n=54, both-modes (thinking ON/OFF) re-evaluation of this (4090) model is in progress; this card will be updated with the final figures. Two caveats hold regardless and are essential:

Judge ⟂ ERC. The LLM judge score is not a reliable indicator of electrical correctness: on the microsoft/SchGen baseline, ~45 % of cells judged ≥6/10 fail ERC. LLM-as-judge alone is unsafe for hardware generation.
The ERC gate is gameable by empty schematics. With no minimum-complexity guard, a header-only .kicad_sch (zero symbols) passes ERC vacuously. We therefore report, alongside the raw pass-rate, the count of non-empty passes (≥1 symbol) — the only meaningful figure.
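
The non-empty guard in the second caveat can be sketched as a small post-processing step. This is a heuristic illustration, not the iact-bench implementation: it assumes that every placed symbol instance in a .kicad_sch file carries a `(lib_id "...")` entry, and `Cell`, `count_placed_symbols`, and `summarize` are hypothetical names. The schematic snippets are synthetic.

```python
import re
from dataclasses import dataclass

# Heuristic: placed symbol instances carry a lib_id; library definitions do not.
LIB_ID = re.compile(r'\(lib_id\s+"')


def count_placed_symbols(kicad_sch_text: str) -> int:
    return len(LIB_ID.findall(kicad_sch_text))


@dataclass
class Cell:
    prompt_id: str
    erc_pass: bool
    schematic_text: str


def summarize(cells: list[Cell]) -> dict[str, int]:
    """Report the raw ERC pass count next to the count of non-empty passes."""
    raw = sum(1 for c in cells if c.erc_pass)
    non_empty = sum(
        1 for c in cells
        if c.erc_pass and count_placed_symbols(c.schematic_text) >= 1
    )
    return {"n": len(cells), "raw_pass": raw, "non_empty_pass": non_empty}


EMPTY = "(kicad_sch (lib_symbols))"
ONE_SYMBOL = '(kicad_sch (lib_symbols (symbol "Device:R")) (symbol (lib_id "Device:R") (at 100 100 0)))'

cells = [
    Cell("p1", True, EMPTY),        # vacuous pass: no symbols placed
    Cell("p2", True, ONE_SYMBOL),   # meaningful pass
    Cell("p3", False, ONE_SYMBOL),  # non-empty but fails ERC
]
summary = summarize(cells)
print(summary)
```

Reporting both counts side by side makes a vacuous pass visible instead of letting it inflate the headline rate.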

Preliminary, thinking-disabled runs of this model reach a high valid-circuit rate on held-out prompts; thinking-enabled runs are less stable and much slower. Final audited numbers will be posted here.

Limitations

Not autonomous — always run ERC/DRC.
Serve with enable_thinking=false; reasoning mode is unstable/slow here.
Best on small-to-medium modules; not multi-sheet designs.
Sensitive to the KiCad file-format version header.

License & attribution

Apache-2.0 (see NOTICE). Derivative of Qwen/Qwen3.6-27B (Apache-2.0), trained on microsoft/SchGen_dataset (MIT); method = SchGen (MIT).

Citation

@article{luo2026schgen,
 title = {SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations},
 author = {Luo, Qinpei and Ma, Ruichun and Zhang, Xinyu and Qiu, Lili},
 journal = {arXiv preprint arXiv:2605.30345},
 year = {2026}
}

@misc{ailiance2026schgenqwen,
 title = {SchGen-Qwen3.6-27B-EU: a sovereign KiCad schematic-generation model},
 author = {{ailiance / L'Électron Rare}},
 year = {2026},
 howpublished = {\url{https://huggingface.co/Ailiance-fr/SchGen-Qwen3.6-27B-EU}}
}
