SchGen-Qwen3.6-27B-EU
Sovereign (EU) schematic-generation model for KiCad. A QLoRA fine-tune of
Qwen/Qwen3.6-27B (dense,
Apache-2.0) that generates executable Python in a semantic schematic DSL to
build KiCad schematics from natural-language requests, following the SchGen
method (Luo et al., 2026).
Trained, fused, quantized and evaluated entirely on-premise by
ailiance / L'Électron Rare, as an
Apache-2.0 sovereign replacement for the non-sovereign gpt-oss-20b + microsoft/SchGen stack. This repo is the full fused BF16 model.
It is an assistant, not an autonomous EDA tool. Generated schematics must be checked with KiCad ERC/DRC before use. Serve with
enable_thinking=false (see Inference). See Evaluation for a transparent, self-critical account of what the benchmarks do and do not show.
Variants
| Repo | Format | Size | Use case |
|---|---|---|---|
| Ailiance-fr/SchGen-Qwen3.6-27B-EU | safetensors BF16 (this repo) | 52 GB | Transformers / vLLM, full precision |
| …-EU-lora | PEFT/LoRA adapter | 152 MB | Apply on Qwen/Qwen3.6-27B |
| …-EU-MLX-8bit | MLX 8-bit | 27 GB | Apple Silicon (mlx-lm) |
| …-EU-MLX-4bit | MLX 4-bit | 14 GB | Apple Silicon, low-memory |
| …-EU-GGUF | GGUF | varies | llama.cpp / Ollama |
Intended use
Generate KiCad schematic-construction code for small-to-medium schematic
modules and open-source hardware, from natural language. The model emits a
small Python DSL (4 primitives) rather than raw .kicad_sch S-expressions;
an executor runs that DSL to produce the schematic.
Inference
Serve with enable_thinking=false. With reasoning enabled the model tends
to over-deliberate, producing very long, sometimes syntactically invalid
completions and high latency; with thinking disabled it is stable and concise.
Use temperature=0 for reproducible schematic code.
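from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ailiance-fr/SchGen-Qwen3.6-27B-EU"
# Placeholder: replace with the full DSL contract used as the system message (see Training data).
SCHGEN_DSL_CONTRACT = "..."

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16", device_map="auto")

messages = [
    {"role": "system", "content": SCHGEN_DSL_CONTRACT},
    {"role": "user", "content": "A 4-pin header exposing I2C (SCL/SDA) plus 3V3 and GND."},
]
# Disable reasoning in the chat template; return_dict=True also returns the attention mask.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt",
    return_dict=True, enable_thinking=False,
).to(model.device)
# Greedy decoding (do_sample=False) is the Transformers equivalent of temperature=0;
# passing temperature=0.0 is rejected when sampling is enabled.
out = model.generate(**inputs, max_new_tokens=2048, do_sample=False)
prompt_len = inputs["input_ids"].shape[-1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))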
The repo ships the production chat_template.jinja.
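When the model sits behind an OpenAI-compatible endpoint (vLLM is listed as a target above), the same two settings could be sent with each request. The sketch below only builds the JSON body. It assumes the server forwards a chat_template_kwargs field to the chat template, as vLLM's chat-completions API does; build_request and the placeholder contract string are illustrative names, not part of this repo.

```python
import json

MODEL_ID = "Ailiance-fr/SchGen-Qwen3.6-27B-EU"


def build_request(system_contract: str, user_prompt: str, max_tokens: int = 2048) -> dict:
    """Build a chat-completions body with thinking disabled and temperature 0."""
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": system_contract},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0,
        "max_tokens": max_tokens,
        # Assumption: the server passes this through to the chat template (vLLM-style).
        "chat_template_kwargs": {"enable_thinking": False},
    }


body = build_request(
    "<DSL contract goes here>",  # placeholder for the real system message
    "A 4-pin header exposing I2C (SCL/SDA) plus 3V3 and GND.",
)
print(json.dumps(body, indent=2))
```

The body can then be POSTed to the server's chat-completions route with any HTTP client.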
Training method
A QLoRA fine-tune on a single NVIDIA RTX 4090 (24 GB, CUDA) — a faithful port of an MLX dense-27B recipe, but without the Apple-Silicon Metal sequence-length ceiling, so completions are not truncated at train time.
- Base. Qwen/Qwen3.6-27B: a dense 27B model with a hybrid attention stack (linear/state-space linear_attn layers + standard self_attn), 64 layers. Dense (not the MoE sibling) was chosen to match the gpt-oss-20b it replaces and to avoid MoE LoRA hot-swap fragility.
- QLoRA. 4-bit NF4 double-quant base, bf16 compute, paged_adamw_8bit, gradient accumulation 16. LoRA rank 32, α 1024 (scale 32), dropout 0.01, applied to the top 16 layers (48–63), targeting q/k/v/o/gate/up/down. Prompt-masked loss (loss on the completion only). Only 0.15 % of weights are trainable (39.8 M of 26.9 B params).
- Sequence length: 3072 (full-context; ~2× the 1536 ceiling of the MLX/Mac run), so the long tail of the DSL completions is preserved, not cut. This is the key improvement over the Mac-trained variant.
- Curriculum (3 phases, adapter-chained, decaying LR): the microsoft/SchGen pairs are partitioned into three disjoint splits (~1200 unique pairs per phase, ~3.6 K unique total; cross-phase overlap is ~0, so the phases are not the whole set repeated). Each phase file holds ~2366 lines (its split, internally duplicated ~2×). The phases are trained in sequence with 500 / 800 / 500 optimizer steps at LR 8e-6 → 5e-6 → 3e-6; each phase resumes the previous phase's adapter (chained). Per-phase mean training loss ≈ 0.034 / 0.0004 / 0.043 (phase 3 trains on a fresh, harder split, so its mean loss is higher than phase 2's).
- Fusion. The phase-3 adapter is merged into the BF16 base (CPU, bit-exact merge_and_unload) → this repo. MLX 8-bit / 4-bit (group_size 64) variants are quantized from the fused model.
- Reproducibility. Seed 42. Single RTX 4090 (KXKM-AI node), CUDA. Training completed 2026-06-14.
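The hyperparameters in the list above can be restated as a small, dependency-free sketch. The dictionary keys mirror peft.LoraConfig keyword arguments, but the *_proj module names are an assumption (the card only says q/k/v/o/gate/up/down), and mask_prompt is a hypothetical helper illustrating prompt-masked loss, not the training script itself.

```python
# Values copied from the training list; module names are assumed, not confirmed.
LORA_KWARGS = {
    "r": 32,
    "lora_alpha": 1024,
    "lora_dropout": 0.01,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
    "layers_to_transform": list(range(48, 64)),  # top 16 of 64 layers
}

# LoRA scales the low-rank update by alpha / r, which gives the quoted scale.
lora_scale = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]

# (optimizer steps, learning rate) per curriculum phase, trained in sequence.
PHASES = [(500, 8e-6), (800, 5e-6), (500, 3e-6)]
total_steps = sum(steps for steps, _ in PHASES)

IGNORE_INDEX = -100  # label value that PyTorch cross-entropy ignores by default


def mask_prompt(prompt_ids: list[int], completion_ids: list[int]) -> tuple[list[int], list[int]]:
    """Return (input_ids, labels) so the loss covers only the completion tokens."""
    input_ids = prompt_ids + completion_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + completion_ids
    return input_ids, labels


input_ids, labels = mask_prompt([11, 12, 13], [21, 22])
print(lora_scale, total_steps, labels)
```

With the real library, the same dictionary could be passed as LoraConfig(**LORA_KWARGS) once the module names have been checked against the base model.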
Training data
microsoft/SchGen_dataset
(MIT): ~8 K NL→code pairs. Each record is a 3-message chat:
- system: the full DSL contract, i.e. four primitives add_schematic_symbol(...), get_pin_location(symbol_ref, pin_name), add_label(...), connect_pins(sym_a, pin_a, sym_b, pin_b) + write_out_all_wires(), with A4-sheet placement rules.
- user: a natural-language request.
- assistant: executable Python using relative placement and pin-name-based wiring (e.g. connect_pins("SDA_0","1","J1","Pin_2")).
(Upstream sources are SparkFun open-hardware designs, CC-BY-SA-4.0; the released pairs are the synthesized DSL + NL, MIT.)
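Because the assistant turn is executable Python, it may be worth checking a completion statically before handing it to an executor. The following is a minimal sketch, assuming a strict policy where only the DSL function names listed above may be called; unknown_calls is a hypothetical helper and not part of the SchGen executor.

```python
import ast

# Function names taken from the DSL contract described above.
ALLOWED_CALLS = {
    "add_schematic_symbol", "get_pin_location", "add_label",
    "connect_pins", "write_out_all_wires",
}


def unknown_calls(code: str) -> set[str]:
    """Return every called name that is not a DSL primitive.

    Raises SyntaxError if the completion is not valid Python.
    """
    found = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Call):
            func = node.func
            # Plain calls are judged by name; anything else (os.system, f()(), ...)
            # is reported by its source text and therefore never allowed.
            name = func.id if isinstance(func, ast.Name) else ast.unparse(func)
            if name not in ALLOWED_CALLS:
                found.add(name)
    return found


sample = 'connect_pins("SDA_0", "1", "J1", "Pin_2")\nwrite_out_all_wires()\n'
print(unknown_calls(sample))                         # set()
print(unknown_calls('import os\nos.system("ls")'))   # {'os.system'}
```

This policy is deliberately strict: a completion that uses builtins such as range or round would be flagged, so a real gate would probably extend the allowlist.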
Validation method
Evaluated with iact-bench (EU-AI-Act audit harness), domain kicad-sch,
validator kicad-pro-sch-gate — a real ERC gate — plus a separate
stylistic LLM judge.
- Pipeline (identical for every model; only the LLM endpoint changes): prompt → BM25 symbol pre-selection → LLM emits Python DSL → SchGen executor builds a .kicad_sch → KiCad 10.0.3 kicad-cli sch erc --severity-all → PASS iff zero violations. Deterministic: seed 42, temperature 0.
- Judge (advisory): an LLM rates output 0–10 against an (empty) reference, so it measures stylistic plausibility, not electrical correctness.
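The final ERC step of the pipeline could be scripted roughly as follows. This is a sketch, not the iact-bench validator: erc_command and erc_passes are hypothetical helpers, and the --exit-code-violations flag is an addition assumed here so that violations surface as a non-zero exit code instead of requiring the report to be parsed.

```python
import subprocess
from pathlib import Path


def erc_command(sch_path: str, report_path: str) -> list[str]:
    """Build the kicad-cli ERC invocation for one schematic."""
    return [
        "kicad-cli", "sch", "erc",
        "--severity-all",          # flag quoted in the pipeline description
        "--exit-code-violations",  # assumption: added so violations change the exit code
        "--output", report_path,
        sch_path,
    ]


def erc_passes(sch_path: str) -> bool:
    """PASS iff kicad-cli exits with 0 (requires KiCad's kicad-cli on PATH)."""
    report_path = str(Path(sch_path).with_suffix(".rpt"))
    result = subprocess.run(erc_command(sch_path, report_path),
                            capture_output=True, text=True)
    return result.returncode == 0


print(" ".join(erc_command("out.kicad_sch", "out.rpt")))
```

A schematic that fails to load also yields a non-zero exit code, so it is counted as a failure rather than silently skipped.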
How to read benchmark numbers (self-critical)
A full n=54, both-modes (thinking ON/OFF) re-evaluation of this (4090)
model is in progress; this card will be updated with the final figures. Two
caveats hold regardless and are essential:
- Judge ⟂ ERC. The LLM judge is anti-correlated with electrical correctness (on the microsoft/SchGen baseline, ~45 % of cells judged ≥6/10 fail ERC). LLM-as-judge alone is unsafe for hardware generation.
- The ERC gate is gameable by empty schematics. With no minimum-complexity guard, a header-only .kicad_sch (zero symbols) passes ERC vacuously. We therefore report, alongside the raw pass-rate, the count of non-empty passes (≥1 symbol), which is the only meaningful figure.
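The difference between the raw pass count and the non-empty pass count can be made concrete with a short sketch. It assumes that each placed symbol in a .kicad_sch file carries one (lib_id "...") field, which is a heuristic and not the harness's actual counter; Cell, count_placed_symbols and summarize are hypothetical names, and the three cells are toy data.

```python
import re
from dataclasses import dataclass


def count_placed_symbols(kicad_sch_text: str) -> int:
    """Heuristic: count (lib_id "...") fields, one per placed symbol instance."""
    return len(re.findall(r'\(lib_id\s+"', kicad_sch_text))


@dataclass
class Cell:
    erc_violations: int  # violations reported by the ERC gate
    symbol_count: int    # placed symbols in the generated schematic


def summarize(cells: list[Cell]) -> dict:
    """Report raw ERC passes next to passes that contain at least one symbol."""
    raw = sum(c.erc_violations == 0 for c in cells)
    non_empty = sum(c.erc_violations == 0 and c.symbol_count >= 1 for c in cells)
    return {"n": len(cells), "raw_passes": raw, "non_empty_passes": non_empty}


# Toy inputs, not benchmark data.
empty_sheet = '(kicad_sch (lib_symbols))'
one_resistor = '(kicad_sch (lib_symbols) (symbol (lib_id "Device:R") (at 100 50 0)))'
cells = [
    Cell(0, count_placed_symbols(empty_sheet)),   # passes ERC vacuously
    Cell(0, count_placed_symbols(one_resistor)),  # a meaningful pass
    Cell(2, 5),                                   # fails ERC
]
print(summarize(cells))
```

In this toy run two cells pass ERC but only one of them contains a circuit, which is exactly the gap the non-empty figure is meant to expose.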
Preliminary, thinking-disabled runs of this model reach a high valid-circuit rate on held-out prompts; thinking-enabled runs are less stable and much slower. Final audited numbers will be posted here.
Limitations
- Not autonomous — always run ERC/DRC.
- Serve with enable_thinking=false; reasoning mode is unstable/slow here.
- Best on small-to-medium modules; not multi-sheet designs.
- Sensitive to the KiCad file-format version header.
License & attribution
Apache-2.0 (see NOTICE). Derivative of Qwen/Qwen3.6-27B (Apache-2.0),
trained on microsoft/SchGen_dataset (MIT); method = SchGen (MIT).
Citation
@article{luo2026schgen,
title = {SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations},
author = {Luo, Qinpei and Ma, Ruichun and Zhang, Xinyu and Qiu, Lili},
journal = {arXiv preprint arXiv:2605.30345},
year = {2026}
}
@misc{ailiance2026schgenqwen,
title = {SchGen-Qwen3.6-27B-EU: a sovereign KiCad schematic-generation model},
author = {{ailiance / L'Électron Rare}},
year = {2026},
howpublished = {\url{https://huggingface.co/Ailiance-fr/SchGen-Qwen3.6-27B-EU}}
}
