nomic-embed-text-v1-unsupervised: A Reproducible Long Context (8192) Text Embedder
nomic-embed-text-v1-unsupervised is an 8192 context length text encoder. This checkpoint is taken after the contrastive pretraining stage of the multi-stage contrastive training that produces the
final model. We are releasing it to open-source the training artifacts from our Nomic Embed Text tech report.
If you want a model for extracting embeddings, we suggest using nomic-embed-text-v1 instead.
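As a rough illustration of that recommendation, the sketch below loads nomic-embed-text-v1 through the sentence-transformers library. Nothing in it comes from this page beyond the model name: the `MODEL_ID` constant and the helpers `add_task_prefix` and `embed` are hypothetical names, and the task-prefix convention (for example `search_document: `) and the need for `trust_remote_code=True` are assumptions based on how the recommended model is usually loaded, so check them against that model's own card.

```python
from typing import List

# The model this card recommends for extracting embeddings.
MODEL_ID = "nomic-ai/nomic-embed-text-v1"


def add_task_prefix(texts: List[str], task: str = "search_document") -> List[str]:
    """Prepend a task prefix such as 'search_document: ' to each text.

    Assumption: the prefix convention is taken from the recommended
    model's usage notes, not from this page.
    """
    return [f"{task}: {text}" for text in texts]


def embed(texts: List[str], task: str = "search_document"):
    """Return one embedding per input text (downloads the model on first call)."""
    # Imported lazily so add_task_prefix works without the dependency installed.
    from sentence_transformers import SentenceTransformer

    # Assumption: trust_remote_code=True is needed because the model
    # ships custom modeling code on the Hub.
    model = SentenceTransformer(MODEL_ID, trust_remote_code=True)
    return model.encode(add_task_prefix(texts, task))
```

Calling `embed(["What is contrastive pretraining?"], task="search_query")` would then return an array with one row per input text, assuming the model can be downloaded.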
Join the Nomic Community
- Nomic: https://nomic.ai
- Discord: https://discord.gg/myY5YDR8z8
- Twitter: https://twitter.com/nomic_ai
Evaluation results
- accuracy on MTEB AmazonCounterfactualClassification (en), test set (self-reported): 76.985
- ap on MTEB AmazonCounterfactualClassification (en), test set (self-reported): 39.472
- f1 on MTEB AmazonCounterfactualClassification (en), test set (self-reported): 70.592
- accuracy on MTEB AmazonPolarityClassification, test set (self-reported): 87.540
- ap on MTEB AmazonPolarityClassification, test set (self-reported): 83.161
- f1 on MTEB AmazonPolarityClassification, test set (self-reported): 87.523
- accuracy on MTEB AmazonReviewsClassification (en), test set (self-reported): 46.808
- f1 on MTEB AmazonReviewsClassification (en), test set (self-reported): 46.263
