URL: https://huggingface.co/hotchpotch/ModernBERT-embedding-CMNBRL

SentenceTransformer based on answerdotai/ModernBERT-base

This is a sentence-transformers model fine-tuned from answerdotai/ModernBERT-base on the msmarco, natural_questions, gooaq, ccnews, and hotpotqa datasets. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
 (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
 (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
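
The Pooling module above enables only pooling_mode_mean_tokens, so each sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padded positions are excluded through an attention mask (a common way to implement mean pooling). The function name mean_pool and the toy inputs are hypothetical and not part of the model or the library.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of each sequence, skipping padded positions.

    token_embeddings: list of sequences, each a list of equal-length token vectors.
    attention_mask:   list of sequences of 1 (real token) or 0 (padding).
    """
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        kept = [vec for vec, keep in zip(vectors, mask) if keep]
        count = max(len(kept), 1)  # avoid division by zero for an all-padding row
        dim = len(vectors[0])
        pooled.append([sum(vec[d] for vec in kept) / count for d in range(dim)])
    return pooled


# Toy input: 2 sequences, 3 token slots, 4 dimensions (the real model uses 768).
tokens = [
    [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [8.0, 9.0, 10.0, 11.0]],
    [[12.0, 13.0, 14.0, 15.0], [16.0, 17.0, 18.0, 19.0], [20.0, 21.0, 22.0, 23.0]],
]
mask = [[1, 1, 0], [1, 1, 1]]  # the last token of the first sequence is padding
pooled = mean_pool(tokens, mask)
print(pooled)
```

Because the mask removes padding before averaging, a short sentence batched with a long one gets the same embedding it would get on its own.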

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("hotchpotch/ModernBERT-embedding-CMNBRL")
# Run inference
queries = [
 "what is the best paying engineering job",
]
documents = [
 "The 20 highest-paying jobs for engineering majors. Engineering jobs pay well. To find out just how lucrative they really are, we turned to PayScale, the creator of the world's largest compensation database. To find the 20 highest-paying jobs for engineering majors, PayScale first identified the most common jobs for those with a bachelor's degree (and nothing more) who work full-time in the US. Chief architects and vice president's of business development topped the list, both earning an impressive $151,000 a year.",
 'Aviation is a combat arms branch which encompasses 80 percent of the commissioned officer operational flying positions within the Army (less those in Aviation Material Management and Medical Service Corps).',
 'Depending on the thickness and size of the chop, it can take anywhere from eight to 30 minutes. Hereâ\x80\x99s a helpful cooking chart and some tips to achieve delicious pork chops every time. Pork chops are a crowd pleaser, especially once you master your grilling technique. For safe consumption, itâ\x80\x99s recommended to cook pork until it reaches an internal temperature of 145°F or 65°C. Depending on the cut and thickness of your chop, the time it may take to reach this can vary. To make sure your chops are the right temperature, use a digital meat thermometer.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.9709, 0.7909, 0.6977]])
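
The similarity call above returns one score per query-document pair. The evaluation section reports cosine-based metrics, so a reasonable assumption is that these scores are cosine similarities, although the card does not state the configured similarity function. The sketch below shows cosine scoring and ranking on toy 3-dimensional vectors. The helpers cosine_similarity and rank_documents are hypothetical and are not library functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    order = sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Toy vectors standing in for the 768-dimensional embeddings.
query = [1.0, 0.0, 1.0]
docs = [[2.0, 0.0, 2.0], [1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
ranking = rank_documents(query, docs)
for index, score in ranking:
    print(index, round(score, 4))
```

Cosine similarity ignores vector length, which is why the first toy document scores highest even though it is twice as long as the query vector.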

Evaluation

Metrics

Information Retrieval

  • Datasets: NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
  • Evaluated with InformationRetrievalEvaluator
Dataset             cosine_accuracy@10  cosine_precision@10  cosine_recall@10  cosine_ndcg@10  cosine_mrr@10  cosine_map@10
NanoClimateFEVER    0.68                0.09                 0.374             0.3204          0.4207         0.2385
NanoDBPedia         0.94                0.39                 0.2684            0.5013          0.7497         0.3713
NanoFEVER           0.98                0.102                0.9333            0.7971          0.7732         0.7398
NanoFiQA2018        0.74                0.122                0.5628            0.4595          0.5142         0.3761
NanoHotpotQA        0.94                0.13                 0.65              0.6496          0.8306         0.5639
NanoMSMARCO         0.84                0.084                0.84              0.5915          0.5118         0.5118
NanoNFCorpus        0.7                 0.256                0.1351            0.2981          0.4498         0.2048
NanoNQ              0.78                0.084                0.76              0.6279          0.5953         0.576
NanoQuoraRetrieval  1.0                 0.132                0.986             0.9387          0.9367         0.9121
NanoSCIDOCS         0.82                0.176                0.3597            0.3413          0.5092         0.2305
NanoArguAna         0.9                 0.09                 0.9               0.5898          0.4908         0.4908
NanoSciFact         0.8                 0.092                0.8               0.6514          0.6098         0.5992
NanoTouche2020      0.9388              0.4102               0.282             0.4762          0.7203         0.326
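
The metrics above are all computed from a ranked list of retrieved documents cut off at rank 10. As a rough illustration, the sketch below computes them for a single toy query with binary relevance. It follows common textbook definitions (with average precision normalized by min(k, number of relevant documents), which is one convention among several). The evaluator's exact implementation may differ in details, and the function name retrieval_metrics_at_k and the document ids are hypothetical.

```python
import math


def retrieval_metrics_at_k(ranked_ids, relevant_ids, k=10):
    """Binary-relevance retrieval metrics for one query, truncated at rank k."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]
    num_hits = sum(hits)

    # Reciprocal rank of the first relevant document, 0 if none is in the top k.
    first_hit = next((rank for rank, hit in enumerate(hits, start=1) if hit), None)

    # DCG of the actual ranking versus the best possible ranking (IDCG).
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    # Average precision: mean of precision values taken at each relevant hit.
    running_hits, precision_sum = 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            running_hits += 1
            precision_sum += running_hits / rank

    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids) if relevant_ids else 0.0,
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
        "mrr": 1.0 / first_hit if first_hit else 0.0,
        "map": precision_sum / ideal_hits if ideal_hits > 0 else 0.0,
    }


# Toy query: relevant documents appear at ranks 2 and 4; "d5" is never retrieved.
metrics = retrieval_metrics_at_k(["d3", "d1", "d7", "d2", "d9"], {"d1", "d2", "d5"}, k=5)
for name, value in metrics.items():
    print(f"{name}@5: {value:.4f}")
```

The reported per-dataset numbers are averages of such per-query values over all queries in a dataset.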

Nano BEIR

  • Dataset: NanoBEIR_mean
  • Evaluated with NanoBEIREvaluator with these parameters:
    {
     "dataset_names": [
     "climatefever",
     "dbpedia",
     "fever",
     "fiqa2018",
     "hotpotqa",
     "msmarco",
     "nfcorpus",
     "nq",
     "quoraretrieval",
     "scidocs",
     "arguana",
     "scifact",
     "touche2020"
     ],
     "dataset_id": "sentence-transformers/NanoBEIR-en"
    }
    
Metric Value
cosine_accuracy@10 0.8507
cosine_precision@10 0.166
cosine_recall@10 0.604
cosine_ndcg@10 0.5571
cosine_mrr@10 0.624
cosine_map@10 0.4724
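
The NanoBEIR_mean values appear to be the unweighted average of the 13 per-dataset scores. That is an assumption, but it is consistent with the numbers reported in this card, as this short check on the cosine_ndcg@10 values shows.

```python
# Per-dataset cosine_ndcg@10 values copied from the Information Retrieval table.
ndcg_at_10 = {
    "NanoClimateFEVER": 0.3204,
    "NanoDBPedia": 0.5013,
    "NanoFEVER": 0.7971,
    "NanoFiQA2018": 0.4595,
    "NanoHotpotQA": 0.6496,
    "NanoMSMARCO": 0.5915,
    "NanoNFCorpus": 0.2981,
    "NanoNQ": 0.6279,
    "NanoQuoraRetrieval": 0.9387,
    "NanoSCIDOCS": 0.3413,
    "NanoArguAna": 0.5898,
    "NanoSciFact": 0.6514,
    "NanoTouche2020": 0.4762,
}

# Unweighted arithmetic mean over the 13 datasets.
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
print(round(mean_ndcg, 4))  # 0.5571
```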

Training Details

Training Datasets

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 8192
  • per_device_eval_batch_size: 512
  • learning_rate: 0.0001
  • weight_decay: 0.01
  • num_train_epochs: 1
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • dataloader_drop_last: True
  • dataloader_num_workers: 12
  • dataloader_prefetch_factor: 2
  • remove_unused_columns: False
  • optim: adamw_torch
  • batch_sampler: no_duplicates
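
To make the learning_rate, lr_scheduler_type, and warmup_ratio settings concrete, the sketch below approximates a linear warmup followed by cosine decay to zero, which is the usual shape for a cosine scheduler with warmup. It assumes 521 total optimizer steps (the last step shown in the training logs) and a warmup length of ceil(0.1 * 521) = 53 steps. The trainer's exact step count and rounding may differ, and the function name cosine_lr_with_warmup is hypothetical.

```python
import math


def cosine_lr_with_warmup(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 521  # assumption: last step shown in the training logs
for step in (0, 26, 53, 287, 521):
    print(step, f"{cosine_lr_with_warmup(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the rate peaks at 1e-4 at step 53, falls to half of that at the midpoint of the decay phase (step 287), and reaches zero at the final step.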

All Hyperparameters

Training Logs

Epoch   Step  Training Loss
0.0190  10    11.3289
0.0381  20    7.5743
0.0571  30    5.4003
0.0762  40    3.399
0.0952  50    2.7399
0.1143  60    2.415
0.1333  70    2.3843
0.1524  80    1.9827
0.1714  90    1.8858
0.1905  100   1.7143
0.2095  110   2.0079
0.2286  120   1.8461
0.2476  130   1.7032
0.2667  140   1.6531
0.2857  150   1.9902
0.3048  160   1.6245
0.3238  170   1.685
0.3429  180   1.657
0.3619  190   1.8747
0.3810  200   1.4671
0.4     210   1.5957
0.4190  220   1.5083
0.4381  230   1.5014
0.4571  240   1.4548
0.4762  250   1.5598
0.4952  260   1.3879
0.5143  270   1.5633
0.5333  280   1.5092
0.5524  290   1.4434
0.5714  300   1.5024
0.5905  310   1.511
0.6095  320   1.4404
0.6286  330   1.6083
0.6476  340   1.4197
0.6667  350   1.5548
0.6857  360   1.5642
0.7048  370   1.4709
0.7238  380   1.482
0.7429  390   1.5472
0.7619  400   1.4899
0.7810  410   1.3321
0.8     420   1.5174
0.8190  430   1.3945
0.8381  440   1.5877
0.8571  450   1.3143
0.8762  460   1.5343
0.8952  470   1.4968
0.9143  480   1.4361
0.9333  490   1.4353
0.9524  500   1.3146
0.9714  510   1.3722
0.9905  520   1.3098
0       521   -

Steps 10 to 520 logged the training loss only. The cosine_ndcg@10 columns were filled in only at step 521 (epoch logged as 0, no training loss):

Dataset             cosine_ndcg@10
NanoClimateFEVER    0.3204
NanoDBPedia         0.5013
NanoFEVER           0.7971
NanoFiQA2018        0.4595
NanoHotpotQA        0.6496
NanoMSMARCO         0.5915
NanoNFCorpus        0.2981
NanoNQ              0.6279
NanoQuoraRetrieval  0.9387
NanoSCIDOCS         0.3413
NanoArguAna         0.5898
NanoSciFact         0.6514
NanoTouche2020      0.4762
NanoBEIR_mean       0.5571

Framework Versions

  • Python: 3.11.14
  • Sentence Transformers: 5.3.0.dev0
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu129
  • Accelerate: 1.12.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1
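
When reproducing results, it can help to compare a local environment against the versions listed above. The following is a small sketch using only the standard library. The distribution names in EXPECTED are assumed to match the usual PyPI package names, and the helper compare_versions is hypothetical.

```python
from importlib import metadata

# Versions listed in this card; the keys are assumed PyPI distribution names.
EXPECTED = {
    "sentence-transformers": "5.3.0.dev0",
    "transformers": "4.57.1",
    "torch": "2.8.0+cu129",
    "accelerate": "1.12.0",
    "datasets": "4.4.1",
    "tokenizers": "0.22.1",
}


def compare_versions(expected):
    """Map each package to (expected version, installed version or None, match flag)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions(EXPECTED).items():
    status = "ok" if matches else "differs"
    print(f"{package}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily break inference, but it is a useful first thing to check if embeddings or scores differ from the ones shown in this card.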

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
 title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
 author = "Reimers, Nils and Gurevych, Iryna",
 booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
 month = "11",
 year = "2019",
 publisher = "Association for Computational Linguistics",
 url = "https://arxiv.org/abs/1908.10084",
}
Model size: 0.1B params (Safetensors)
Tensor type: F32
