URL: https://huggingface.co/onnx-community/punctuate-all-ONNX

punctuate-all (ONNX)

This is an ONNX version of kredor/punctuate-all. It was automatically converted and uploaded using a Hugging Face Space.

Usage with Transformers.js

See the pipeline documentation for token-classification: https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.TokenClassificationPipeline
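
A minimal sketch of what that usage could look like, assuming the `@huggingface/transformers` package is installed and that the model's labels match the class names in the evaluation report ("0" for no punctuation, then ".", ",", "?", "-", ":"). The helper names `classify` and `punctuationPredictions` are hypothetical; only `pipeline` and its `ignore_labels` option come from Transformers.js.

```javascript
// Assumption: "0" is the label for "no punctuation after this token",
// as suggested by the class names in the evaluation report.
const NO_PUNCTUATION = "0";

// Hypothetical helper: runs the model on raw, unpunctuated text.
async function classify(text) {
  // Loaded lazily so the pure helper below can be used without the package.
  const { pipeline } = await import("@huggingface/transformers");
  const classifier = await pipeline(
    "token-classification",
    "onnx-community/punctuate-all-ONNX",
  );
  // ignore_labels defaults to ["O"]; an empty list returns every token.
  return classifier(text, { ignore_labels: [] });
}

// Hypothetical helper: keeps only the tokens that the model tags
// with a punctuation mark.
function punctuationPredictions(tokens) {
  return tokens
    .filter((t) => t.entity !== NO_PUNCTUATION)
    .map((t) => ({ index: t.index, word: t.word, mark: t.entity }));
}

// Example call (downloads the model on first use):
// classify("hello how are you").then((out) => console.log(punctuationPredictions(out)));
```

Each returned item carries the token (`word`), its position (`index`), and the predicted label (`entity`). How the tokens are merged back into punctuated text is left to the caller.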


This is based on Oliver Guhr's work. The difference is that it is a fine-tuned xlm-roberta-base instead of an xlm-roberta-large, and it is trained on twelve languages instead of four. The languages are: English, German, French, Spanish, Bulgarian, Italian, Polish, Dutch, Czech, Portuguese, Slovak, Slovenian.

----- report -----

              precision    recall  f1-score   support

           0       0.99      0.99      0.99  73317475
           .       0.94      0.95      0.95   4484845
           ,       0.86      0.86      0.86   6100650
           ?       0.88      0.85      0.86    136479
           -       0.60      0.29      0.39    233630
           :       0.71      0.49      0.58    152424

    accuracy                           0.98  84425503
   macro avg       0.83      0.74      0.77  84425503
weighted avg       0.98      0.98      0.98  84425503
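
The gap between the macro and weighted averages comes from class imbalance: the "0" class accounts for most of the support. The sketch below recomputes both averages from the per-class rows of the report. The names `rows`, `macroAvg`, `weightedAvg`, and `round2` are illustrative and not part of the model or any library.

```javascript
// Per-class rows copied from the report: [label, precision, recall, f1, support].
const rows = [
  ["0", 0.99, 0.99, 0.99, 73317475],
  [".", 0.94, 0.95, 0.95, 4484845],
  [",", 0.86, 0.86, 0.86, 6100650],
  ["?", 0.88, 0.85, 0.86, 136479],
  ["-", 0.60, 0.29, 0.39, 233630],
  [":", 0.71, 0.49, 0.58, 152424],
];

// Total number of evaluated tokens (sum of the support column).
const totalSupport = rows.reduce((sum, r) => sum + r[4], 0);

// Macro average: every class counts equally, so rare classes
// such as "-" and ":" pull the result down.
function macroAvg(col) {
  return rows.reduce((sum, r) => sum + r[col], 0) / rows.length;
}

// Weighted average: each class is weighted by its support,
// so the dominant "0" class drives the result.
function weightedAvg(col) {
  return rows.reduce((sum, r) => sum + r[col] * r[4], 0) / totalSupport;
}

// Round to two decimals, matching the precision of the report.
const round2 = (x) => Math.round(x * 100) / 100;

console.log("total support:", totalSupport);
console.log("macro f1:", round2(macroAvg(3)));
console.log("weighted f1:", round2(weightedAvg(3)));
```

Run as written, this reproduces the reported macro F1 of 0.77 and weighted F1 of 0.98, and the supports sum to the reported total of 84425503.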

----- confusion matrix -----

 t/p     0     .     ,     ?     -     :
   0   1.0   0.0   0.0   0.0   0.0   0.0
   .   0.0   1.0   0.0   0.0   0.0   0.0
   ,   0.1   0.0   0.9   0.0   0.0   0.0
   ?   0.0   0.1   0.0   0.8   0.0   0.0
   -   0.1   0.1   0.5   0.0   0.3   0.0
   :   0.0   0.3   0.1   0.0   0.0   0.5