URL: https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder


InCoder-32B: Code Foundation Model for Industrial Scenarios

Model Summary

InCoder-32B (Industrial-Coder-32B) is the first 32B-parameter code foundation model purpose-built for industrial code intelligence. While general-purpose code LLMs excel at mainstream software tasks, they often struggle with the unique demands of industrial programming: hardware semantics, specialized language constructs, strict resource constraints, and domain-specific correctness verification.

Presented in the paper InCoder-32B: Code Foundation Model for Industrial Scenarios, InCoder-32B unifies code intelligence across five industrial domains:

| Domain | Languages & Frameworks |
|---|---|
| 🔧 Chip Design | Verilog, SystemVerilog, RTL |
| ⚡ GPU Kernel Optimization | CUDA, Triton |
| 🖥️ Embedded Systems | C/C++, ARM Cortex-M4, STM32 |
| 🔨 Compiler Optimization | x86-64 ASM, C/C++, LLVM-IR |
| 📐 3D Modeling / CAD | CadQuery, OpenCascade, Python |

InCoder-32B achieves competitive performance on general code benchmarks and sets strong open-source baselines across the evaluated industrial domains, leading on most of the industrial benchmarks reported under Key Results.


Key Results

General Code Benchmarks

| Benchmark | InCoder-32B |
|---|---|
| SWE-bench Verified | 74.8% |
| LiveCodeBench (Pass@1) | 49.14% |
| BFCL v3 | 60.99% |
| HumanEval+ | 89.6% |
| MBPP+ | 78.3% |
| BigCodeBench (Full) | 49.8% |

Industrial Code Benchmarks

| Benchmark | Domain | InCoder-32B | Best Competing Open-Weight Model |
|---|---|---|---|
| VeriScope Score | Chip Design | 80.7 | 83.2 (GLM-5) |
| CAD-Coder Compile | 3D Modeling | 82.0% | 48.0% (Kimi-K2-Thinking) |
| KernelBench L1 | GPU Optimization | 22.2% | 16.2% (GLM-5) |
| KernelBench L2 | GPU Optimization | 36.0% | 28.0% (KernelBench L2) |

InCoder-32B leads all open-weight baselines on CAD-Coder and KernelBench (all three levels), and even surpasses proprietary models like Claude-Sonnet-4.6 on CAD-Coder IoU and KernelBench L1/L2/L3.


Model Architecture

InCoder-32B adopts a standard decoder-only Transformer architecture with the following configuration:

| Hyperparameter | Value |
|---|---|
| Parameters | ~32B |
| Layers | 64 |
| Hidden Size | 5,120 |
| Max Context Length | 131,072 (128K) |
| Positional Encoding | RoPE (θ = 500,000) |
| Precision | BFloat16 |
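
As a rough worked example of what this configuration implies, the sketch below estimates the BF16 weight footprint and the RoPE wavelengths for θ = 500,000. The head dimension of 128 is an assumption (it is not stated in this card), and the memory estimate covers weights only, ignoring activations and the KV cache.

```python
import math

N_PARAMS = 32e9          # "~32B" from the architecture table
BYTES_PER_PARAM = 2      # BFloat16 stores each value in 2 bytes
ROPE_THETA = 500_000.0   # RoPE base from the architecture table
MAX_CONTEXT = 131_072    # 128K tokens
HEAD_DIM = 128           # ASSUMPTION: not stated in the model card


def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


def rope_wavelengths(theta: float, head_dim: int) -> list[float]:
    """Wavelength in tokens of each rotary pair: 2*pi * theta**(2i/d)."""
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]


weights_gb = weight_memory_gb(N_PARAMS, BYTES_PER_PARAM)
wavelengths = rope_wavelengths(ROPE_THETA, HEAD_DIM)

print(f"BF16 weights: ~{weights_gb:.0f} GB")
print(f"Shortest RoPE wavelength: {wavelengths[0]:.2f} tokens")
print(f"Longest RoPE wavelength: {wavelengths[-1]:,.0f} tokens")
print(f"Longest wavelength exceeds context: {wavelengths[-1] > MAX_CONTEXT}")
```

Under these assumptions the weights alone need about 64 GB, which is consistent with sharding the model across several GPUs, and the slowest rotary component completes less than one full rotation over the 128K context window.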

Training Pipeline: Code-Flow

InCoder-32B is trained through a three-stage Code-Flow pipeline:

Stage 1: Pre-training & Annealing

  • Industrial Recall: Data pipeline using rule-based filtering, FastText classifiers, and semantic retrieval for Verilog, CUDA, firmware C, and CadQuery.
  • Refinement: OCR extraction from technical manuals, multi-level deduplication, and repository-level fork consolidation.
  • Training: 15T total tokens using Autoregressive LM + Fill-in-the-Middle (FIM) objectives.
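
The FIM objective mentioned above can be illustrated with a small sketch. It splits a document at two random points and rearranges it into prefix-suffix-middle order, using the sentinel tokens that appear in the Usage section. The character-level split and the helper name `make_fim_example` are illustrative assumptions, not the authors' actual data pipeline.

```python
import random

FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def make_fim_example(document: str, rng: random.Random) -> tuple[str, str]:
    """Return (prompt, target) for one prefix-suffix-middle training example.

    Hypothetical helper: real pipelines usually split on token boundaries
    and mix FIM examples with ordinary left-to-right examples.
    """
    # Pick two cut points; sorting ensures the first cut is not after the second.
    a, b = sorted(rng.randint(0, len(document)) for _ in range(2))
    prefix, middle, suffix = document[:a], document[a:b], document[b:]
    prompt = f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    return prompt, middle


source = "assign y = a & b;\nassign z = a | b;\n"
prompt, target = make_fim_example(source, random.Random(0))
print(prompt + target)
```

The model is trained to produce `target` after seeing `prompt`, so at inference time the same layout lets it fill a gap given the code on both sides.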

Stage 2: Mid-Training (Context Extension)

Context window extended progressively from 8K to 128K tokens:

  • 8K → 32K: Targets file-level tasks like completing RTL modules or kernel functions.
  • 32K → 128K: Unlocks long-context capabilities for extended debugging and cross-module projects.

Stage 3: Post-Training

The model is fine-tuned on 2.5M supervised fine-tuning (SFT) samples constructed from real industrial tasks. Each sample is verified by execution using toolchains such as Icarus Verilog, nvcc, and Renode (used to simulate STM32 targets).
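
A minimal sketch of execution-grounded filtering, assuming that "verified" simply means the toolchain exits with status 0. The helper `passes_toolchain` is hypothetical and is not the authors' pipeline, which presumably also runs testbenches or simulations. The demo uses Python's byte-compiler as a stand-in so that it runs without any hardware toolchain installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_toolchain(code: str, suffix: str, command: list[str],
                     timeout_s: float = 30.0) -> bool:
    """Write `code` to a temp file, run `command + [path]`, report success.

    Hypothetical helper: a zero exit code is treated as "verified".
    """
    with tempfile.TemporaryDirectory() as workdir:
        path = Path(workdir) / f"candidate{suffix}"
        path.write_text(code)
        try:
            result = subprocess.run(
                command + [str(path)],
                cwd=workdir,
                capture_output=True,
                timeout=timeout_s,
            )
        except (subprocess.TimeoutExpired, FileNotFoundError):
            # Treat hangs and missing tools as failures rather than crashing.
            return False
        return result.returncode == 0


# Demo: the Python byte-compiler stands in for a real toolchain.
checker = [sys.executable, "-m", "py_compile"]
samples = ["x = 1 + 2\n", "x = = 1\n"]  # second sample has a syntax error
kept = [s for s in samples if passes_toolchain(s, ".py", checker)]
print(f"kept {len(kept)} of {len(samples)} samples")
```

For Verilog, the same helper could be called with `["iverilog", "-o", "candidate.out"]` as the command and `".v"` as the suffix, assuming Icarus Verilog is installed and on the PATH.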


Usage

Installation

pip install transformers accelerate

Basic Inference

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Multilingual-Multimodal-NLP/IndustrialCoder"

# Load the tokenizer and the BF16 weights, spreading layers across available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = """Write a synthesizable Verilog module for a UART transmitter (8N1 protocol).
The module should accept 8-bit parallel data and serialize it onto a TX line."""

# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample with a low temperature to keep the generated code close to deterministic.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.2,
    do_sample=True,
)

# The decoded text contains the prompt followed by the generated continuation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Deployment with vLLM

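For production deployment, you can use vLLM to create an OpenAI-compatible API endpoint. The command below shards the model across 8 GPUs; adjust --tensor-parallel-size to match your hardware.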

vllm serve Multilingual-Multimodal-NLP/IndustrialCoder --tensor-parallel-size 8
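
Once the server is running, it can be queried like any OpenAI-compatible endpoint. The sketch below uses only the Python standard library and assumes the server runs on the same machine on vLLM's default port 8000; the helper names `build_completion_request` and `complete` are illustrative.

```python
import json
import urllib.request

# ASSUMPTIONS: server started with the command above on this machine,
# listening on vLLM's default port 8000.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "Multilingual-Multimodal-NLP/IndustrialCoder"


def build_completion_request(prompt: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.2,
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str) -> str:
    """Send the request and return the text of the first completion."""
    with urllib.request.urlopen(build_completion_request(prompt)) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Building the request needs no running server; calling complete() does.
request = build_completion_request("// Triton kernel for vector addition\n")
print(request.full_url)
```

Calling `complete("...")` sends the request and returns the generated text; it raises a `urllib.error.URLError` if no server is listening at `BASE_URL`.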

Fill-in-the-Middle (FIM)

InCoder-32B supports FIM completion for code infilling tasks:

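# Reuses `tokenizer` and `model` from the Basic Inference example.
prefix = """// CUDA kernel for RMS Normalization
__global__ void rms_norm_kernel(float* output, const float* input,
                                const float* weight, int N, float eps) {
    int idx = blockIdx.x;
"""
suffix = """
    output[idx * N + tid] = normalized * weight[tid];
}"""

# Prefix-suffix-middle layout: the model generates the code that belongs
# between `prefix` and `suffix`.
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
inputs = tokenizer(fim_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens (the infilled middle), then
# reassemble the full kernel in source order.
prompt_len = inputs["input_ids"].shape[1]
middle = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(prefix + middle + suffix)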

Limitations & Disclaimers

Based on failure analysis, the model may struggle with:

  • API Knowledge: Linker errors from undefined HAL/CMSIS functions in embedded C.
  • Functional Semantics: Producing compilable but functionally incorrect RTL under complex logic scenarios.
  • Optimization: Correct but sub-optimal GPU kernel performance.

Always review and test generated code in a sandboxed environment. Industrial code (RTL, embedded firmware) requires expert review before deployment.


Citation

@article{yang2026incoder,
  title={InCoder-32B: Code Foundation Model for Industrial Scenarios},
  author={Yang, Jian and Zhang, Wei and Wu, Jiajun and Cheng, Junhang and Guo, Shawn
          and Wang, Haowen and Gu, Weicheng and Du, Yaxin and Li, Joseph and Xu, Fanglin
          and others},
  journal={arXiv preprint arXiv:2603.16790},
  year={2026}
}