This is an expanded version of the dleemiller/WordLlamaDetect model. It stacks two WordLlama-based models to improve performance.
- Supported languages: 148
        Training data (740k samples)
                      │
                      ▼
    ┌───────────────────────────────────┐
    │       Phase 1: Base Models        │
    │                                   │
    │ ┌─────────────┐   ┌─────────────┐ │
    │ │LID Model 01 │   │ LID Model 02│ │
    │ └──────┬──────┘   └──────┬──────┘ │
    └────────┼─────────────────┼────────┘
             │   train each    │
             │  independently  │
             ▼                 ▼
       lid_models[0]     lid_models[1]
             │                 │
             └────────┬────────┘
                      │
                      ▼
    collect_preds() → X: (N, 2*148) = (N, 296)
       model1 logits     model2 logits
         (N, 148)    cat   (N, 148)
             └────────┬────────┘
                      ▼
                  (N, 296)
                      │
              Linear(296 → 148)  ← 296*148 = 43,808 params trained
                      │
                      ▼
                  (N, 148) → CrossEntropy(y)
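The stacking step in the diagram can be illustrated with a small, dependency-free sketch. This is not the repository's training code: `concat_logits`, `train_meta`, `predict`, and `fake_base_logits` are hypothetical names, the base-model logits are synthetic, and the toy run uses 3 classes instead of 148 so it finishes instantly. The linear layer is written without a bias term because that matches the 296*148 parameter count quoted above; whether the released model uses a bias is not stated.

```python
import math
import random

NUM_LANGS = 148       # number of supported languages, from the model card
NUM_BASE_MODELS = 2   # two stacked base models


def concat_logits(logits_a, logits_b):
    """Concatenate per-sample logits: (N, C) and (N, C) -> (N, 2C)."""
    return [row_a + row_b for row_a, row_b in zip(logits_a, logits_b)]


def linear(x, weight):
    """Bias-free linear layer for one sample; weight has shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(w_row, x)) for w_row in weight]


def softmax(scores):
    """Numerically stable softmax over one row of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_meta(features, labels, num_classes, lr=0.1, epochs=50):
    """Fit the meta-classifier with plain SGD on the cross-entropy loss."""
    in_dim = len(features[0])
    weight = [[0.0] * in_dim for _ in range(num_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(linear(x, weight))
            for c in range(num_classes):
                # Gradient of the cross-entropy loss w.r.t. the class-c score.
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(in_dim):
                    weight[c][j] -= lr * grad * x[j]
    return weight


def predict(x, weight):
    """Return the index of the highest-scoring class."""
    scores = linear(x, weight)
    return max(range(len(scores)), key=scores.__getitem__)


# Parameter count of the real meta-layer: Linear(296 -> 148) without bias.
meta_params = (NUM_BASE_MODELS * NUM_LANGS) * NUM_LANGS

# Toy demo with 3 classes instead of 148.
random.seed(0)
toy_classes = 3
labels = [i % toy_classes for i in range(30)]


def fake_base_logits(ys):
    """Hypothetical base-model output: a peak at the true class plus noise."""
    return [[(2.0 if c == y else 0.0) + random.gauss(0.0, 0.3)
             for c in range(toy_classes)] for y in ys]


X = concat_logits(fake_base_logits(labels), fake_base_logits(labels))
W = train_meta(X, labels, toy_classes)
accuracy = sum(predict(x, W) == y for x, y in zip(X, labels)) / len(labels)
print(meta_params, len(X[0]), accuracy)
```

Only the meta-layer weights are updated here; the base models are treated as fixed feature extractors, which is the usual reading of "train each independently" followed by a stacked linear layer.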
Evaluation results on FLORES+:
| Pair | Num Languages | Accuracy | F1 Macro | Metric per Base Model |
|---|---|---|---|---|
| gemma3_27b + gemma_300m | 148 | 0.9307 | 0.9303 | gemma3_27b: Acc: 0.9147, F1: 0.9149; gemma_300m: Acc: 0.9087, F1: 0.9078 |
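For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how accuracy and macro-averaged F1 are conventionally computed. The evaluation script used for these results is not shown in this card, so the functions and the four-sample toy data below are illustrative assumptions only. The last lines compute the accuracy gain of the stacked model over the better base model from the reported numbers.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical, not taken from FLORES+).
y_true = ["eng_Latn", "fra_Latn", "vie_Latn", "fra_Latn"]
y_pred = ["eng_Latn", "fra_Latn", "fra_Latn", "fra_Latn"]
toy_acc = accuracy(y_true, y_pred)
toy_f1 = macro_f1(y_true, y_pred)

# Accuracy gain of the stack over the best base model, from the table above.
stack_gain = round(0.9307 - max(0.9147, 0.9087), 4)
print(toy_acc, toy_f1, stack_gain)
```

Macro averaging weights every language equally regardless of how many test sentences it has, which is why it is commonly reported alongside accuracy for language identification.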
How to use:
import sys

from huggingface_hub import snapshot_download

# Download all repository files, including model.py, into the local cache
local_dir = snapshot_download(repo_id="Bonkh/lid-stack-model-gemma_3_27b-gemma_3_300m")

# Make the downloaded model.py importable
sys.path.insert(0, local_dir)
from model import LIDStack

# Load the stacked model
model = LIDStack.from_pretrained("Bonkh/lid-stack-model-gemma_3_27b-gemma_3_300m")

# Inference: a single string, a list of strings, or with probabilities
print(model.predict("Hello, how are you?")) # → "eng_Latn"
print(model.predict(["Bonjour", "こんにちは"])) # → ["fra_Latn", "jpn_Jpan"]
print(model.predict("Xin chào", return_probs=True)) # → [("vie_Latn", 0.97)]
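The labels in the examples above follow the FLORES-style `language_Script` pattern (an ISO 639-3 language code and an ISO 15924 script code joined by an underscore). If downstream code needs the two parts separately, a small helper such as the one below may be useful. `LangTag` and `parse_label` are hypothetical names that are not part of this repository, and the sketch assumes every label the model returns follows this pattern.

```python
from typing import NamedTuple


class LangTag(NamedTuple):
    language: str  # ISO 639-3 code, e.g. "eng"
    script: str    # ISO 15924 code, e.g. "Latn"


def parse_label(label: str) -> LangTag:
    """Split a label like "eng_Latn" into its language and script codes."""
    language, sep, script = label.partition("_")
    if not sep or len(language) != 3 or len(script) != 4:
        raise ValueError(f"unexpected label format: {label!r}")
    return LangTag(language, script)


tag = parse_label("jpn_Jpan")
print(tag.language, tag.script)
```

Raising on unexpected input, rather than guessing, keeps malformed labels from silently propagating into later processing.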