mmBERT-32K Intent Classifier (Merged Model)
Full merged model for intent classification based on mmBERT-32K-YaRN (32K context, multilingual). The LoRA adapter has been merged into the base model, so the checkpoint loads directly for inference without PEFT.
Model Details
- Base Model: llm-semantic-router/mmbert-32k-yarn
- Training Method: LoRA (rank 32) merged into full model
- Model Size: ~1.2 GB
- Use Case: Production deployment, Rust/Go inference
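The ~1.2 GB figure is consistent with storing about 0.3B parameters as 32-bit floats (4 bytes each). A back-of-the-envelope check, where the parameter count is a rounded value and the helper name `estimate_checkpoint_size_gb` is purely illustrative:

```python
def estimate_checkpoint_size_gb(num_params: float, bytes_per_param: int = 4) -> float:
    """Rough weight-only size in GB (10**9 bytes); ignores file metadata."""
    return num_params * bytes_per_param / 1e9


# 0.3B is a rounded parameter count; F32 means 4 bytes per parameter.
size_gb = estimate_checkpoint_size_gb(0.3e9)
print(f"~{size_gb:.1f} GB")
```

Passing `bytes_per_param=2` gives the corresponding estimate for a 16-bit copy of the weights, should one be produced; this card only describes the F32 checkpoint.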
Training Data
- Primary: TIGER-Lab/MMLU-Pro (~12K academic questions)
- Supplement: LLM-Semantic-Router/category-classifier-supplement (653 samples including casual "other" examples)
Categories (14 classes)
biology, business, chemistry, computer science, economics, engineering, health, history, law, math, other, philosophy, physics, psychology
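Before wiring the classifier into a router, it may be worth confirming that the loaded label map matches the 14 categories listed above. The sketch below compares label names only and makes no assumption about which index maps to which label; `EXPECTED_CATEGORIES` and `check_label_map` are illustrative names, not part of the model or of any library.

```python
EXPECTED_CATEGORIES = [
    "biology", "business", "chemistry", "computer science", "economics",
    "engineering", "health", "history", "law", "math", "other",
    "philosophy", "physics", "psychology",
]


def check_label_map(id2label: dict) -> list:
    """Return the sorted labels that appear in only one of the two sets."""
    found = set(id2label.values())
    expected = set(EXPECTED_CATEGORIES)
    # Symmetric difference: labels missing from the map or unexpected in it.
    return sorted(found ^ expected)


# Toy label map built from the list itself, used here only for illustration.
demo_map = dict(enumerate(EXPECTED_CATEGORIES))
print(check_label_map(demo_map))
```

With the real model, `check_label_map(model.config.id2label)` would be expected to return an empty list if the configuration agrees with this card.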
Performance
| Metric | Score |
|---|---|
| Test Accuracy | 80.0% |
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the merged model and its tokenizer from the Hub
model = AutoModelForSequenceClassification.from_pretrained("llm-semantic-router/mmbert32k-intent-classifier-merged")
tokenizer = AutoTokenizer.from_pretrained("llm-semantic-router/mmbert32k-intent-classifier-merged")

# Inference
inputs = tokenizer("How do neural networks learn?", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    predicted_class = probs.argmax().item()
    confidence = probs[0][predicted_class].item()

# Get label (id2label keys are integers once the config is loaded)
print(f"Category: {model.config.id2label[predicted_class]}, Confidence: {confidence:.2%}")
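When the top prediction is uncertain, it can help to inspect the runner-up categories as well. The following is a minimal sketch in plain Python that applies a softmax to a single row of logits and returns the `k` most likely labels; `top_k_categories` is a hypothetical helper, and the toy logits and three-label map are made up for illustration.

```python
import math


def top_k_categories(logits, id2label, k=3):
    """Softmax over one row of logits, then return the k most likely (label, prob) pairs."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices sorted from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy logits and label map, used here only for illustration.
demo = top_k_categories([2.0, 0.5, 1.0], {0: "math", 1: "law", 2: "other"}, k=2)
print(demo)
```

With the real model, the inputs would be `outputs.logits[0].tolist()` and `model.config.id2label`, assuming the label map is keyed by integer class index.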
For Rust/Candle Inference
This merged model is compatible with the candle-binding Rust library for high-performance inference in production systems.
Safetensors
- Model size: 0.3B params
- Tensor type: F32
Model tree for llm-semantic-router/mmbert32k-intent-classifier-merged
- Base model: jhu-clsp/mmBERT-base
- Quantized: llm-semantic-router/mmbert-32k-yarn