URL: https://huggingface.co/thivy/norbert4-base-nli-norwegian

SentenceTransformer based on ltg/norbert4-base

This is a sentence-transformers model fine-tuned from ltg/norbert4-base on the all-nli-norwegian dataset. It maps sentences and paragraphs to a 640-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: ltg/norbert4-base
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 640 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: all-nli-norwegian
  • Language: Norwegian (no)

Full Model Architecture

SentenceTransformer(
 (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'GptBertModel'})
 (1): Pooling({'word_embedding_dimension': 640, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
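
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence vector is the average of the token vectors. The following is a minimal sketch of that idea in plain Python, not the library's implementation: the function name mean_pool and the toy 4-dimensional vectors are assumptions for illustration (the real model produces 640-dimensional vectors), and padding positions are skipped through an attention mask.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            total = [t + v for t, v in zip(total, vector)]
            count += 1
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [t / count for t in total]


# Toy input (assumption): three token vectors of dimension 4,
# where the last position is padding and must be ignored.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]

sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

Only the two unmasked tokens contribute to the average, which is why padding does not shift the sentence vector.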

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("thivy/norbert4-base-nli-norwegian")
# Run inference
sentences = [
    'En mann lager et sandmaleri på gulvet.',
    'En mann lager kunst.',
    'En kvinne ødelegger et sandmaleri.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 640)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6251, 0.2931],
#         [0.6251, 1.0000, 0.1305],
#         [0.2931, 0.1305, 1.0000]])
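
A similarity matrix like the one above can be turned into a simple nearest-neighbour ranking. The sketch below hard-codes the scores from the example output so that it runs without downloading the model; the helper name rank_neighbours is an assumption for illustration. With a live model, the tensor returned by model.similarity could be converted with .tolist() first.

```python
sentences = [
    'En mann lager et sandmaleri på gulvet.',
    'En mann lager kunst.',
    'En kvinne ødelegger et sandmaleri.',
]

# Similarity scores copied from the example output above.
similarities = [
    [1.0000, 0.6251, 0.2931],
    [0.6251, 1.0000, 0.1305],
    [0.2931, 0.1305, 1.0000],
]


def rank_neighbours(query_index, scores):
    """Return (index, score) pairs for every other sentence, best first."""
    row = scores[query_index]
    others = [(i, s) for i, s in enumerate(row) if i != query_index]
    return sorted(others, key=lambda pair: pair[1], reverse=True)


ranking = rank_neighbours(0, similarities)
for index, score in ranking:
    print(f"{score:.4f}  {sentences[index]}")
```

For the first sentence, the entailed sentence about making art ranks above the contradicting one, which matches what NLI-style training is meant to encourage.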

Evaluation

Metrics

Triplet

Metric Value
cosine_accuracy 0.9547
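
As a rough illustration of what a triplet cosine accuracy measures, the sketch below counts a triplet as correct when the anchor is more similar to the positive than to the negative, then reports the fraction of correct triplets. This is an assumed reading of the metric, not the evaluator's source code; the function names and the 2-dimensional toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(
        1 for anchor, pos, neg in triplets
        if cosine(anchor, pos) > cosine(anchor, neg)
    )
    return correct / len(triplets)


# Toy embeddings (assumption): the third triplet is deliberately wrong.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),
    ([1.0, 0.0], [0.8, 0.2], [-1.0, 0.0]),
]

accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)
```

Under this reading, the reported 0.9547 would mean that about 95% of evaluation triplets place the positive closer to the anchor than the negative.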

Training Details

Training Dataset

all-nli-norwegian

  • Dataset: all-nli-norwegian at 98cabde
  • Size: 556,367 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
              anchor        positive       negative
    type      string        string         string
    min       6 tokens      5 tokens       5 tokens
    mean      9.53 tokens   12.03 tokens   12.7 tokens
    max       47 tokens     40 tokens      49 tokens
  • Samples:
    1. anchor: En person på en hest hopper over et havarert fly.
       positive: En person er utendørs, på en hest.
       negative: En person er på en diner og bestiller en omelett.
    2. anchor: Barn smiler og vinker til kameraet
       positive: Det er barn til stede
       negative: Barna rynker pannen
    3. anchor: En gutt hopper på skateboard midt på en rød bro.
       positive: Gutten gjør et skateboardtriks.
       negative: Gutten skater nedover fortauet.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
     "scale": 20.0,
     "similarity_fct": "cos_sim",
     "gather_across_devices": false
    }
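
To make the loss parameters above concrete, here is a minimal plain-Python sketch of a multiple-negatives ranking objective with scale 20.0 and cosine similarity. It assumes that each anchor is scored against every positive and every negative in the batch, and that cross-entropy is taken with the anchor's own positive as the target. The function name mnrl_loss and the toy vectors are hypothetical, and the real loss operates on batched tensors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnrl_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i should pick positive i."""
    candidates = positives + negatives  # in-batch positives, then negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (assumption): two triplets with 2-dimensional embeddings.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned = mnrl_loss(anchors, positives, negatives)
swapped = mnrl_loss(anchors, positives[::-1], negatives)
print(aligned, swapped)
```

The aligned batch gives a much smaller loss than the batch with swapped positives. The scale factor sharpens the softmax, so small cosine differences already produce a strong training signal.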
    

Evaluation Dataset

all-nli-norwegian

  • Dataset: all-nli-norwegian at 98cabde
  • Size: 6,561 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
              anchor        positive      negative
    type      string        string        string
    min       5 tokens      4 tokens      3 tokens
    mean      17.72 tokens  8.98 tokens   9.5 tokens
    max       74 tokens     31 tokens     29 tokens
  • Samples:
    1. anchor: To kvinner klemmer mens de holder take-away pakker.
       positive: To kvinner holder pakker.
       negative: Mennene slåss utenfor en deli.
    2. anchor: To små barn i blå drakter, en med nummer 9 og en med nummer 2, står på trinn i et bad og vasker hendene i en vask.
       positive: To barn i nummererte drakter vasker hendene.
       negative: To barn i jakker går til skolen.
    3. anchor: En mann selger donuts til en kunde under et verdensutstillingsarrangement holdt i byen Angeles
       positive: En mann selger donuts til en kunde.
       negative: En kvinne drikker kaffen sin på en liten kafé.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
     "scale": 20.0,
     "similarity_fct": "cos_sim",
     "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
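
The hyperparameters above, combined with the training set size of 556,367 samples, allow a rough estimate of the schedule length. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these is stated in the card, so the numbers are an estimate only.

```python
import math

train_samples = 556_367           # from the training dataset section
per_device_train_batch_size = 32  # from the hyperparameters above
num_train_epochs = 1
warmup_ratio = 0.1

num_devices = 1                   # assumption: not stated in the card
gradient_accumulation_steps = 1   # assumption: not stated in the card

effective_batch = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(total_steps, warmup_steps)
# 17387 1739
```

Under these assumptions the learning rate would ramp up to 2e-05 over roughly the first 1,700 optimizer steps.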

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.0
  • Transformers: 4.57.3
  • PyTorch: 2.9.1
  • Accelerate: 1.12.0
  • Datasets: 4.4.2
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
 title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
 author = "Reimers, Nils and Gurevych, Iryna",
 booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
 month = "11",
 year = "2019",
 publisher = "Association for Computational Linguistics",
 url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
 title={Efficient Natural Language Response Suggestion for Smart Reply},
 author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
 year={2017},
 eprint={1705.00652},
 archivePrefix={arXiv},
 primaryClass={cs.CL}
}

Safetensors model size: 0.1B params (tensor type: F32)
