Please read Apache License, Version 2.0 before downloading this model.

`imprt/kushinada-hubert-large`

This is a Japanese HuBERT Large model pre-trained on 62,215 hours of speech, extracted by voice activity detection from large-scale Japanese TV broadcast audio.
This model was trained using code from the official repository.

Usage

import soundfile as sf
from transformers import AutoFeatureExtractor

model = "imprt/kushinada-hubert-large"
feature_extractor = AutoFeatureExtractor.from_pretrained(model)

audio_file = "/path/to/16k_audio_file"
audio_input, sr = sf.read(audio_file)  # sr should be 16000, as the placeholder path suggests
inputs = feature_extractor(audio_input, sampling_rate=sr)
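
The snippet above expects a 16 kHz file, and `sf.read` returns multi-channel audio as a 2-D array of shape `(frames, channels)`. The following is a minimal sketch of a guard you could run before the feature extractor. It is not part of the original model card: `prepare_waveform` and `TARGET_SR` are hypothetical names, and averaging the channels to get mono is an assumption about what is appropriate for your data.

```python
import numpy as np

# Assumed input rate, inferred from the "16k_audio_file" placeholder in the usage snippet.
TARGET_SR = 16_000


def prepare_waveform(audio: np.ndarray, sr: int) -> np.ndarray:
    """Return a mono float32 waveform, or raise if the sampling rate is not 16 kHz."""
    if sr != TARGET_SR:
        # Resampling is deliberately left to the caller; this sketch only validates.
        raise ValueError(f"expected {TARGET_SR} Hz audio, got {sr} Hz; resample first")
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 2:
        # Multi-channel input laid out as (frames, channels): average the channels.
        audio = audio.mean(axis=1)
    elif audio.ndim != 1:
        raise ValueError(f"unexpected audio shape {audio.shape}")
    return audio
```

With this helper, the last line of the usage snippet could become `feature_extractor(prepare_waveform(audio_input, sr), sampling_rate=sr)`.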

References

@article{journals/corr/abs-2106-07447,
 author = {Hsu, Wei-Ning and Bolte, Benjamin and Tsai, Yao-Hung Hubert and Lakhotia, Kushal and Salakhutdinov, Ruslan and Mohamed, Abdelrahman},
 title = {HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units},
 journal = {CoRR},
 volume = {abs/2106.07447},
 year = 2021,
 url = {https://arxiv.org/abs/2106.07447}
}