Paper: 2502.14673
ChunkFormer-Large-Vie: Large-Scale Pretrained ChunkFormer for Vietnamese Automatic Speech Recognition
Ranked #1: Speech Recognition on Common Voice Vi
Ranked #1: Speech Recognition on VIVOS
License: CC BY-NC 4.0
GitHub
Paper
Model size
Citation
If you use this work in your research, please cite:
@INPROCEEDINGS{10888640,
author={Le, Khanh and Ho, Tuan Vu and Tran, Dung and Chau, Duc Thanh},
booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription},
year={2025},
volume={},
number={},
pages={1-5},
keywords={Scalability;Memory management;Graphics processing units;Signal processing;Performance gain;Hardware;Resource management;Speech processing;Standards;Context modeling;chunkformer;masked batch;long-form transcription},
doi={10.1109/ICASSP49660.2025.10888640}
}
Contact
Downloads last month: 5
Safetensors
Model size: 0.2B params
Tensor type: F32
Datasets used to train songlindotiot/chunkformer-large-vie
Paper for songlindotiot/chunkformer-large-vie
Evaluation results
- Test WER on common-voice-vietnamese (Common Voice Vi Leaderboard): 6.660
- Test WER on VIVOS (Vivos Leaderboard): 4.180
- Test WER on VLSP - Task 1 (self-reported): 14.090
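
For readers unfamiliar with the metric, the sketch below shows how a word error rate (WER) is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It is a minimal illustration, not the evaluation script behind the numbers above. The function name `word_error_rate` and the whitespace tokenization are assumptions, and any text normalization used for the reported scores is not specified here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: (substitutions + deletions + insertions) / reference words.

    Hypothetical helper for illustration; tokenization is plain whitespace splitting.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Leaderboards typically report a corpus-level figure, that is, total errors over total reference words across all utterances, usually expressed as a percentage; averaging per-utterance values would give a different number.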
