This model is part of the FunASR ecosystem, an industrial-grade open-source toolkit for ASR, VAD, punctuation restoration, speaker diarization, emotion and audio event detection, and LLM-based ASR.

Paraformer-zh

Non-autoregressive end-to-end speech recognition for Mandarin Chinese: production-ready, and about 120x faster than real time on GPU.

Paraformer is a non-autoregressive (NAR) ASR model that generates the entire output in parallel, achieving significant speedups over autoregressive models like Whisper while maintaining competitive accuracy.
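To make the "120x realtime" figure concrete, the speed of an ASR system is usually reported as a real-time factor (RTF): processing time divided by audio duration. The sketch below is plain arithmetic, not part of FunASR; the function names and the one-hour example are illustrative assumptions.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration. Lower means faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup_over_realtime(processing_seconds: float, audio_seconds: float) -> float:
    """Inverse of RTF: seconds of audio transcribed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Hypothetical example: one hour of audio at the advertised 120x speed.
audio_seconds = 3600.0
processing_seconds = audio_seconds / 120  # time budget implied by 120x

print("RTF:", real_time_factor(processing_seconds, audio_seconds))
print("Speedup:", speedup_over_realtime(processing_seconds, audio_seconds))
```

To measure this on your own hardware, time the `model.generate` call (for example with `time.perf_counter`) and pass the elapsed seconds together with the clip duration.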

Quick Start

from funasr import AutoModel

# Basic recognition
model = AutoModel(model="funasr/paraformer-zh", hub="hf", device="cuda")
result = model.generate(input="audio.wav")
print(result[0]["text"])

Full Pipeline (VAD + ASR + Punctuation + Speaker Diarization)

from funasr import AutoModel

model = AutoModel(
    model="funasr/paraformer-zh",
    hub="hf",
    vad_model="funasr/fsmn-vad",
    punc_model="funasr/ct-punc",
    spk_model="funasr/campplus",
    device="cuda",
)

result = model.generate(input="meeting.wav")
# Output includes timestamps, punctuation, and speaker labels
for sentence in result[0]["sentence_info"]:
    print(f"[Speaker {sentence['spk']}] {sentence['text']}")
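For meeting transcripts it is often handier to merge consecutive sentences from the same speaker into a single turn. The following is a minimal sketch of that post-processing step. It assumes each `sentence_info` entry is a dict with `spk` and `text` keys (as used above) plus `start` and `end` times in milliseconds; the `start`/`end` keys and their unit are assumptions to verify against your own output. The sample data is hand-written for illustration, not real model output.

```python
from typing import Dict, List


def merge_speaker_turns(sentences: List[Dict]) -> List[Dict]:
    """Merge consecutive sentences that share a speaker label into one turn."""
    turns: List[Dict] = []
    for s in sentences:
        if turns and turns[-1]["spk"] == s["spk"]:
            # Same speaker keeps talking: extend the current turn.
            # Text is joined without a separator, which suits Chinese output.
            turns[-1]["text"] += s["text"]
            turns[-1]["end"] = s["end"]
        else:
            # Speaker changed (or first sentence): start a new turn.
            turns.append(
                {"spk": s["spk"], "text": s["text"],
                 "start": s["start"], "end": s["end"]}
            )
    return turns


# Hand-written sample standing in for result[0]["sentence_info"].
sample = [
    {"spk": 0, "text": "大家好,", "start": 0, "end": 900},
    {"spk": 0, "text": "我们开始吧。", "start": 900, "end": 2100},
    {"spk": 1, "text": "好的。", "start": 2300, "end": 2900},
]

for turn in merge_speaker_turns(sample):
    start_s, end_s = turn["start"] / 1000, turn["end"] / 1000
    print(f"[Speaker {turn['spk']}] {start_s:.1f}s-{end_s:.1f}s {turn['text']}")
```

With real output, pass `result[0]["sentence_info"]` in place of `sample`.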

Features

120x realtime on GPU (non-autoregressive parallel decoding)
Mixed Chinese and English recognition
VAD (voice activity detection) for long audio, via the fsmn-vad model
Punctuation restoration with the ct-punc model
Speaker diarization with the CAM++ model (funasr/campplus)
Streaming and offline modes
ONNX export supported
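Streaming mode works on short chunks of audio rather than a whole file. The sketch below shows only the chunking side: it splits 16 kHz samples into fixed-size pieces and flags the last one. The 600 ms chunk length and the `iter_chunks` helper are assumptions for illustration, and the commented-out `generate` call reflects how FunASR's streaming examples are commonly written; check the toolkit documentation for the exact arguments before relying on it.

```python
from typing import Iterator, Sequence, Tuple

SAMPLE_RATE = 16_000  # Hz, matching the model's expected input rate


def iter_chunks(
    samples: Sequence[float],
    chunk_ms: int = 600,  # assumed chunk length, not a documented default
    sample_rate: int = SAMPLE_RATE,
) -> Iterator[Tuple[Sequence[float], bool]]:
    """Yield (chunk, is_final) pairs covering `samples` in order."""
    stride = sample_rate * chunk_ms // 1000
    if stride <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    # Ceiling division; always emit at least one (possibly empty) final chunk.
    n_chunks = max(1, -(-len(samples) // stride))
    for i in range(n_chunks):
        chunk = samples[i * stride:(i + 1) * stride]
        yield chunk, i == n_chunks - 1


# One second of silence as stand-in audio.
one_second = [0.0] * SAMPLE_RATE
cache = {}  # streaming state would be carried across calls in this dict
for chunk, is_final in iter_chunks(one_second):
    # Assumed call shape for a streaming model (verify against FunASR docs):
    # res = model.generate(input=chunk, cache=cache, is_final=is_final)
    print(len(chunk), is_final)
```

The `is_final` flag matters because a streaming decoder typically needs to know when to flush its remaining buffered output.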

Model Details

Property	Value
Architecture	Paraformer (Non-autoregressive)
Parameters	220M
Languages	Chinese, English
Sample Rate	16 kHz
Training Data	60,000+ hours
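Since the table lists a 16 kHz sample rate, it can be worth checking input files before recognition. The following is a minimal sketch using only Python's standard `wave` module; the helper names are hypothetical, and it handles uncompressed WAV files only. The demo writes a one-second silent file so the check has something to read.

```python
import os
import tempfile
import wave

EXPECTED_RATE = 16_000  # Hz, from the Model Details table


def wav_info(path: str) -> dict:
    """Read sample rate, channel count, and duration from a WAV header."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return {
            "sample_rate": rate,
            "channels": wf.getnchannels(),
            "seconds": wf.getnframes() / rate,
        }


def matches_expected_rate(path: str, expected_rate: int = EXPECTED_RATE) -> bool:
    """True if the file's sample rate equals the expected rate."""
    return wav_info(path)["sample_rate"] == expected_rate


def write_silence(path: str, seconds: float = 1.0, sample_rate: int = EXPECTED_RATE) -> None:
    """Write a mono 16-bit PCM WAV containing silence (demo input only)."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * int(seconds * sample_rate))


# Demo on a temporary file so the example runs without any real audio.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "silence.wav")
    write_silence(demo_path)
    demo_info = wav_info(demo_path)
    demo_ok = matches_expected_rate(demo_path)

print(demo_info, demo_ok)
```

Files at a different rate can be resampled with a tool of your choice before being passed to `model.generate`.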

Related Models

Model	Description	Link
funasr/fsmn-vad	Voice Activity Detection	HF
funasr/ct-punc	Punctuation Restoration	HF
funasr/campplus	Speaker Verification	HF
funasr/paraformer-zh-streaming	Streaming version	HF

Citation

@inproceedings{gao2022paraformer,
 title={Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition},
 author={Gao, Zhifu and Zhang, Shiliang and McLoughlin, Ian and Yan, Zhijie},
 booktitle={INTERSPEECH},
 year={2022}
}

Paper: arXiv:2206.08317 (published Jun 16, 2022)

URL: https://huggingface.co/funasr/paraformer-zh
