SentenceTransformer based on answerdotai/ModernBERT-base
This is a sentence-transformers model finetuned from answerdotai/ModernBERT-base on the msmarco, natural_questions, gooaq, ccnews and hotpotqa datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: answerdotai/ModernBERT-base
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Datasets: msmarco, natural_questions, gooaq, ccnews, hotpotqa
- Language: en
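As an illustration of the similarity function listed above, here is a minimal pure-Python sketch of cosine similarity. The helper name `cosine_similarity` and the toy 3-dimensional vectors are assumptions made for this example; the model's real embeddings have 768 dimensions, and in practice `model.similarity` performs this computation for you.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors for illustration only (hypothetical values).
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(same_direction, orthogonal)
```

Because the score depends only on direction, scaling an embedding does not change its similarity to other embeddings.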
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
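The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence embedding is the average of the token embeddings. The following is a rough sketch of that idea in pure Python, assuming padding positions are excluded through an attention mask; the function name `mean_pool` and the toy 2-dimensional token vectors are hypothetical and not taken from the library.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [t / count for t in total]

# Two real tokens followed by one padding token (toy values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

The padding vector is ignored, so batch padding does not shift the resulting embedding.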
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("hotchpotch/ModernBERT-embedding-CMNBRL")
# Run inference
queries = [
"what is the best paying engineering job",
]
documents = [
"The 20 highest-paying jobs for engineering majors. Engineering jobs pay well. To find out just how lucrative they really are, we turned to PayScale, the creator of the world's largest compensation database. To find the 20 highest-paying jobs for engineering majors, PayScale first identified the most common jobs for those with a bachelor's degree (and nothing more) who work full-time in the US. Chief architects and vice president's of business development topped the list, both earning an impressive $151,000 a year.",
'Aviation is a combat arms branch which encompasses 80 percent of the commissioned officer operational flying positions within the Army (less those in Aviation Material Management and Medical Service Corps).',
"Depending on the thickness and size of the chop, it can take anywhere from eight to 30 minutes. Here's a helpful cooking chart and some tips to achieve delicious pork chops every time. Pork chops are a crowd pleaser, especially once you master your grilling technique. For safe consumption, it's recommended to cook pork until it reaches an internal temperature of 145°F or 65°C. Depending on the cut and thickness of your chop, the time it may take to reach this can vary. To make sure your chops are the right temperature, use a digital meat thermometer.",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.9709, 0.7909, 0.6977]])
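For semantic search, the similarity scores are typically used to order the documents for each query. Below is a small sketch of that step, using the three scores printed above as plain floats. The helper `rank_documents` is a hypothetical name introduced here; with the real model you would first convert one row of `similarities` to a Python list (for example with `.tolist()`).

```python
def rank_documents(scores, top_k=None):
    """Return (document_index, score) pairs sorted from most to least similar."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]

# Scores copied from the example output above (one query, three documents).
scores = [0.9709, 0.7909, 0.6977]
for index, score in rank_documents(scores, top_k=2):
    print(f"document {index}: {score:.4f}")
```

Here the engineering-jobs passage (index 0) ranks first, which matches the query.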
Evaluation
Metrics
Information Retrieval
- Datasets: NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
- Evaluated with InformationRetrievalEvaluator
| Metric | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cosine_accuracy@10 | 0.68 | 0.94 | 0.98 | 0.74 | 0.94 | 0.84 | 0.7 | 0.78 | 1.0 | 0.82 | 0.9 | 0.8 | 0.9388 |
| cosine_precision@10 | 0.09 | 0.39 | 0.102 | 0.122 | 0.13 | 0.084 | 0.256 | 0.084 | 0.132 | 0.176 | 0.09 | 0.092 | 0.4102 |
| cosine_recall@10 | 0.374 | 0.2684 | 0.9333 | 0.5628 | 0.65 | 0.84 | 0.1351 | 0.76 | 0.986 | 0.3597 | 0.9 | 0.8 | 0.282 |
| cosine_ndcg@10 | 0.3204 | 0.5013 | 0.7971 | 0.4595 | 0.6496 | 0.5915 | 0.2981 | 0.6279 | 0.9387 | 0.3413 | 0.5898 | 0.6514 | 0.4762 |
| cosine_mrr@10 | 0.4207 | 0.7497 | 0.7732 | 0.5142 | 0.8306 | 0.5118 | 0.4498 | 0.5953 | 0.9367 | 0.5092 | 0.4908 | 0.6098 | 0.7203 |
| cosine_map@10 | 0.2385 | 0.3713 | 0.7398 | 0.3761 | 0.5639 | 0.5118 | 0.2048 | 0.576 | 0.9121 | 0.2305 | 0.4908 | 0.5992 | 0.326 |
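To make the metric names in the table concrete, here is a simplified sketch of how accuracy, precision, recall, MRR and nDCG at a cutoff k can be computed for a single query with binary relevance. The function name `retrieval_metrics` and the document ids are assumptions for illustration; the actual InformationRetrievalEvaluator may differ in details, and the reported numbers are averages over all queries in a dataset.

```python
import math

def retrieval_metrics(ranked_ids, relevant_ids, k=10):
    """Binary-relevance retrieval metrics for one query at cutoff k."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]
    num_hits = sum(hits)
    # Rank (1-based) of the first relevant document, if any.
    first_hit_rank = next((rank for rank, hit in enumerate(hits, start=1) if hit), None)
    # Discounted cumulative gain of the actual ranking and of an ideal ranking.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids) if relevant_ids else 0.0,
        "mrr": 1.0 / first_hit_rank if first_hit_rank else 0.0,
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
    }

# Hypothetical ranking: one of two relevant documents appears at rank 2.
print(retrieval_metrics(["d3", "d1", "d7", "d2"], {"d1", "d2"}, k=3))
```

Note that precision@10 is capped at 0.1 for a query with a single relevant document, which is one reason precision can look low even when accuracy and recall are high.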
Nano BEIR
- Dataset: NanoBEIR_mean
- Evaluated with NanoBEIREvaluator with these parameters:
{
    "dataset_names": ["climatefever", "dbpedia", "fever", "fiqa2018", "hotpotqa", "msmarco", "nfcorpus", "nq", "quoraretrieval", "scidocs", "arguana", "scifact", "touche2020"],
    "dataset_id": "sentence-transformers/NanoBEIR-en"
}
| Metric | Value |
|---|---|
| cosine_accuracy@10 | 0.8507 |
| cosine_precision@10 | 0.166 |
| cosine_recall@10 | 0.604 |
| cosine_ndcg@10 | 0.5571 |
| cosine_mrr@10 | 0.624 |
| cosine_map@10 | 0.4724 |
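The NanoBEIR_mean values appear to be the unweighted average of the 13 per-dataset scores. As a quick check under that assumption, the sketch below averages the cosine_ndcg@10 values copied from the Information Retrieval table; the variable names are illustrative only.

```python
# cosine_ndcg@10 per dataset, copied from the Information Retrieval table.
ndcg_at_10 = {
    "NanoClimateFEVER": 0.3204,
    "NanoDBPedia": 0.5013,
    "NanoFEVER": 0.7971,
    "NanoFiQA2018": 0.4595,
    "NanoHotpotQA": 0.6496,
    "NanoMSMARCO": 0.5915,
    "NanoNFCorpus": 0.2981,
    "NanoNQ": 0.6279,
    "NanoQuoraRetrieval": 0.9387,
    "NanoSCIDOCS": 0.3413,
    "NanoArguAna": 0.5898,
    "NanoSciFact": 0.6514,
    "NanoTouche2020": 0.4762,
}

# Unweighted (macro) average across datasets.
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
print(round(mean_ndcg, 4))  # 0.5571
```

The result matches the cosine_ndcg@10 value reported for NanoBEIR_mean.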
Training Details
Training Datasets
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 8192
- per_device_eval_batch_size: 512
- learning_rate: 0.0001
- weight_decay: 0.01
- num_train_epochs: 1
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- dataloader_drop_last: True
- dataloader_num_workers: 12
- dataloader_prefetch_factor: 2
- remove_unused_columns: False
- optim: adamw_torch
- batch_sampler: no_duplicates
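The combination of `learning_rate: 0.0001`, `lr_scheduler_type: cosine` and `warmup_ratio: 0.1` describes a linear warmup followed by a cosine decay. The sketch below shows the general shape of such a schedule; it is not the trainer's own code. The function name `lr_at_step`, the rounding of warmup steps with `math.ceil`, and the total of 525 steps (inferred from the epoch column of the training logs, where step 10 corresponds to epoch 0.0190) are all assumptions.

```python
import math

def lr_at_step(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    """Learning rate with linear warmup, then cosine decay toward zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Assumed total step count for illustration; not stated in the model card.
total = 525
for step in (0, 25, 53, 260, 525):
    print(step, lr_at_step(step, total))
```

Under these assumptions the rate peaks at the base value once warmup ends and falls smoothly to zero at the final step.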
All Hyperparameters
Training Logs
| Epoch | Step | Training Loss | NanoClimateFEVER_cosine_ndcg@10 | NanoDBPedia_cosine_ndcg@10 | NanoFEVER_cosine_ndcg@10 | NanoFiQA2018_cosine_ndcg@10 | NanoHotpotQA_cosine_ndcg@10 | NanoMSMARCO_cosine_ndcg@10 | NanoNFCorpus_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoQuoraRetrieval_cosine_ndcg@10 | NanoSCIDOCS_cosine_ndcg@10 | NanoArguAna_cosine_ndcg@10 | NanoSciFact_cosine_ndcg@10 | NanoTouche2020_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0190 | 10 | 11.3289 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0381 | 20 | 7.5743 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0571 | 30 | 5.4003 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0762 | 40 | 3.399 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0952 | 50 | 2.7399 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1143 | 60 | 2.415 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1333 | 70 | 2.3843 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1524 | 80 | 1.9827 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1714 | 90 | 1.8858 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1905 | 100 | 1.7143 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2095 | 110 | 2.0079 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2286 | 120 | 1.8461 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2476 | 130 | 1.7032 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2667 | 140 | 1.6531 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2857 | 150 | 1.9902 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3048 | 160 | 1.6245 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3238 | 170 | 1.685 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3429 | 180 | 1.657 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3619 | 190 | 1.8747 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3810 | 200 | 1.4671 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4 | 210 | 1.5957 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4190 | 220 | 1.5083 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4381 | 230 | 1.5014 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4571 | 240 | 1.4548 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4762 | 250 | 1.5598 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4952 | 260 | 1.3879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5143 | 270 | 1.5633 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5333 | 280 | 1.5092 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5524 | 290 | 1.4434 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5714 | 300 | 1.5024 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5905 | 310 | 1.511 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6095 | 320 | 1.4404 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6286 | 330 | 1.6083 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6476 | 340 | 1.4197 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6667 | 350 | 1.5548 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6857 | 360 | 1.5642 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7048 | 370 | 1.4709 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7238 | 380 | 1.482 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7429 | 390 | 1.5472 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7619 | 400 | 1.4899 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7810 | 410 | 1.3321 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8 | 420 | 1.5174 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8190 | 430 | 1.3945 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8381 | 440 | 1.5877 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8571 | 450 | 1.3143 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8762 | 460 | 1.5343 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8952 | 470 | 1.4968 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9143 | 480 | 1.4361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9333 | 490 | 1.4353 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9524 | 500 | 1.3146 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9714 | 510 | 1.3722 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9905 | 520 | 1.3098 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0 | 521 | - | 0.3204 | 0.5013 | 0.7971 | 0.4595 | 0.6496 | 0.5915 | 0.2981 | 0.6279 | 0.9387 | 0.3413 | 0.5898 | 0.6514 | 0.4762 | 0.5571 |
Framework Versions
- Python: 3.11.14
- Sentence Transformers: 5.3.0.dev0
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu129
- Accelerate: 1.12.0
- Datasets: 4.4.1
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Safetensors
- Model size: 0.1B params
- Tensor type: F32
