xCoT-Distill: Cross-Lingual Chain-of-Thought Distillation for Arabic Reasoning
This model is a QLoRA fine-tune of Qwen/Qwen3-8B trained to reason in English and answer in Arabic.
Method
xCoT-Distill generates (Arabic question, English CoT, Arabic answer) triples using a Qwen3-80B teacher model, then fine-tunes a student model using:
- Cross-lingual SFT with Qwen3 native thinking format
- Contrastive alignment loss on intermediate layers (10-18)
- Curriculum training: binary โ factual MCQ โ reasoning MCQ
Training Data
~17K triples from:
- OALL/Arabic_MMLU (57 subject configs)
- OALL/Arabic_EXAMS (multi-subject Arabic exams)
- MBZUAI/ACVA (Arabic cultural value alignment, True/False)
Usage
Authors
Mark Kashirskiy, Artiom Lipinski, Ilya Makarov
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support
