URL: https://huggingface.co/reazon-research/japanese-wav2vec2-large

reazon-research/japanese-wav2vec2-large

This is a Japanese wav2vec 2.0 Large model pre-trained on the ReazonSpeech v2.0 corpus.

We also release the CTC model reazon-research/japanese-wav2vec2-large-rs35kh derived from this model.

Usage

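import librosa
import torch
from transformers import AutoFeatureExtractor, AutoModel

# Load the feature extractor and the pre-trained encoder.
feature_extractor = AutoFeatureExtractor.from_pretrained("reazon-research/japanese-wav2vec2-large")
model = AutoModel.from_pretrained("reazon-research/japanese-wav2vec2-large")

# Placeholder path; replace with your own audio file.
audio_file = "path/to/audio.wav"

# Load the audio, resampled to 16 kHz.
audio, sr = librosa.load(audio_file, sr=16_000)
inputs = feature_extractor(
    audio,
    return_tensors="pt",
    sampling_rate=sr,
)
# Run the model without tracking gradients.
with torch.inference_mode():
    outputs = model(**inputs)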
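
To get a feel for the shape of `outputs.last_hidden_state`, it can help to estimate how many frames the encoder produces for a given number of audio samples. The sketch below is not part of the original model card. It assumes the default wav2vec 2.0 convolutional feature encoder (kernel sizes `(10, 3, 3, 3, 3, 2, 2)` and strides `(5, 2, 2, 2, 2, 2, 2)`), which this checkpoint may or may not use; compare against `model.config.conv_kernel` and `model.config.conv_stride` before relying on it. The helper name `num_output_frames` is hypothetical.

```python
# Assumption: default wav2vec 2.0 feature-encoder layout. Verify against
# model.config.conv_kernel and model.config.conv_stride for this checkpoint.
DEFAULT_KERNELS = (10, 3, 3, 3, 3, 2, 2)
DEFAULT_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_output_frames(num_samples, kernels=DEFAULT_KERNELS, strides=DEFAULT_STRIDES):
    """Estimate the encoder's output length for an unpadded 1-D waveform."""
    length = num_samples
    for kernel, stride in zip(kernels, strides):
        # Output length of a 1-D convolution without padding or dilation.
        length = (length - kernel) // stride + 1
    return length


# One second of 16 kHz audio.
print(num_output_frames(16_000))  # 49
```

Under these assumptions, one second of 16 kHz audio maps to 49 frames, so each frame advances by roughly 20 ms of audio.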

Citation

@misc{reazon-research-japanese-wav2vec2-large,
 title={japanese-wav2vec2-large},
 author={Sasaki, Yuta},
 url = {https://huggingface.co/reazon-research/japanese-wav2vec2-large},
 year = {2024}
}

License

Apache License 2.0

Model size: 0.3B params (Safetensors)
Tensor type: F32, BF16
