ColBERT MUVERA Femto

This is a PyLate model fine-tuned from NeuML/bert-hash-femto on the unnormalized split of the msmarco-en-bge-gemma dataset. It maps sentences and paragraphs to sequences of 50-dimensional dense vectors and can be used for semantic textual similarity with the MaxSim operator.
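
For intuition, the MaxSim operator can be sketched in a few lines of plain Python. This is a minimal illustration rather than the PyLate or txtai implementation: the function name `maxsim` and the toy 2-dimensional vectors are assumptions made for the example (the model itself emits one 50-dimensional vector per token).

```python
def maxsim(query_vectors, document_vectors):
    """Sum, over query token vectors, of the best dot product with any document token vector."""
    score = 0.0
    for q in query_vectors:
        # Best-matching document token for this query token
        score += max(sum(a * b for a, b in zip(q, d)) for d in document_vectors)
    return score


# Toy example: two query token vectors, two document token vectors
query = [[1.0, 0.0], [0.0, 1.0]]
document = [[0.5, 0.5], [1.0, 0.0]]

print(maxsim(query, document))  # 1.0 + 0.5 = 1.5
```

Each query token contributes only its single best match, which is why the document is stored as a sequence of vectors rather than one pooled vector.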

This model is trained with un-normalized scores, making it compatible with MUVERA fixed-dimensional encoding.
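
To give a rough idea of what a fixed-dimensional encoding does, the sketch below collapses a set of token vectors into one vector using a single SimHash partition. It is a heavily simplified illustration under stated assumptions, not the MUVERA reference implementation: the full method described in the paper also uses several repetitions, dimensionality-reducing projections and special handling for empty buckets, all of which are omitted here. Every function name and the toy data are hypothetical.

```python
def simhash_bucket(vector, hyperplanes):
    """Map a vector to a bucket id, one bit per hyperplane (sign of the dot product)."""
    bucket = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bucket = (bucket << 1) | (1 if dot > 0 else 0)
    return bucket


def fixed_dimensional_encoding(vectors, hyperplanes, is_query):
    """Collapse a list of token vectors into one vector of size 2**len(hyperplanes) * dim."""
    dim = len(vectors[0])
    buckets = 2 ** len(hyperplanes)
    sums = [[0.0] * dim for _ in range(buckets)]
    counts = [0] * buckets

    for vector in vectors:
        b = simhash_bucket(vector, hyperplanes)
        counts[b] += 1
        for i, value in enumerate(vector):
            sums[b][i] += value

    encoding = []
    for b in range(buckets):
        # Queries keep the per-bucket sum; documents use the per-bucket mean.
        # Empty buckets are left as zeros in this simplified version.
        scale = 1.0 if is_query or counts[b] == 0 else 1.0 / counts[b]
        encoding.extend(value * scale for value in sums[b])
    return encoding


# Toy example: 2-dimensional vectors and one hyperplane, giving two buckets
hyperplanes = [[1.0, 0.0]]
tokens = [[1.0, 0.0], [3.0, 0.0], [-1.0, 2.0]]

print(fixed_dimensional_encoding(tokens, hyperplanes, is_query=False))  # [-1.0, 2.0, 2.0, 0.0]
print(fixed_dimensional_encoding(tokens, hyperplanes, is_query=True))   # [-1.0, 2.0, 4.0, 0.0]
```

The dot product of a query encoding and a document encoding then stands in for the multi-vector MaxSim score, which lets a standard single-vector index do the first-stage retrieval.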

Usage (txtai)

This model can be used to build embeddings databases with txtai for semantic search and/or as a knowledge source for retrieval augmented generation (RAG).

Note: txtai 9.0+ is required for late interaction model support.

import txtai

embeddings = txtai.Embeddings(
    path="neuml/colbert-muvera-femto",
    content=True
)

# documents() is a placeholder for an iterable of documents to index
embeddings.index(documents())

# Run a query
embeddings.search("query to run")

Late interaction models also excel as rerankers. The following builds a reranker pipeline on top of the embeddings index.

from txtai.pipeline import Reranker, Similarity

similarity = Similarity(path="neuml/colbert-muvera-femto", lateencode=True)
ranker = Reranker(embeddings, similarity)
ranker("query to run")

Usage (PyLate)

Alternatively, the model can be loaded with PyLate.

from pylate import rank, models

queries = [
    "query A",
    "query B",
]

documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]

documents_ids = [
    [1, 2],
    [1, 3, 2],
]

model = models.ColBERT(
    model_name_or_path="neuml/colbert-muvera-femto",
)

queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)

Full Model Architecture

ColBERT(
 (0): Transformer({'max_seq_length': 299, 'do_lower_case': False}) with Transformer model: BertHashModel 
 (1): Dense({'in_features': 50, 'out_features': 50, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
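
The architecture above implies an upper bound on how much space one document's token vectors can take. The following back-of-the-envelope sketch assumes vectors are stored as float32 (4 bytes per value) and that inputs are truncated at `max_seq_length`; the function name is hypothetical, and real indexes may compress or quantize vectors.

```python
EMBEDDING_DIM = 50      # out_features of the Dense layer
MAX_SEQ_LENGTH = 299    # max_seq_length of the Transformer module
BYTES_PER_FLOAT32 = 4   # assumption: vectors stored as float32


def multivector_bytes(num_tokens):
    """Approximate storage for one text's token vectors, capped at the maximum sequence length."""
    tokens = min(num_tokens, MAX_SEQ_LENGTH)
    return tokens * EMBEDDING_DIM * BYTES_PER_FLOAT32


print(multivector_bytes(10))    # 2000 bytes for a 10-token text
print(multivector_bytes(299))   # 59800 bytes at the maximum length
```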

Evaluation

BEIR Subset

The following table shows a subset of BEIR scored with the txtai benchmarks script.

Reported scores are nDCG@10, grouped into the following three categories.

Full multi-vector MaxSim

Model	Parameters	NFCorpus	SciDocs	SciFact	Average
ColBERT v2	110M	0.3165	0.1497	0.6456	0.3706
ColBERT MUVERA Femto	0.2M	0.2513	0.0870	0.4710	0.2698
ColBERT MUVERA Pico	0.4M	0.3005	0.1117	0.6452	0.3525
ColBERT MUVERA Nano	0.9M	0.3180	0.1262	0.6576	0.3673
ColBERT MUVERA Micro	4M	0.3235	0.1244	0.6676	0.3718

MUVERA encoding + MaxSim re-ranking of the top 100 results, per the MUVERA paper

Model	Parameters	NFCorpus	SciDocs	SciFact	Average
ColBERT v2	110M	0.3025	0.1538	0.6278	0.3614
ColBERT MUVERA Femto	0.2M	0.2316	0.0858	0.4641	0.2605
ColBERT MUVERA Pico	0.4M	0.2821	0.1004	0.6090	0.3305
ColBERT MUVERA Nano	0.9M	0.2996	0.1201	0.6249	0.3482
ColBERT MUVERA Micro	4M	0.3095	0.1228	0.6464	0.3596

MUVERA encoding only

Model	Parameters	NFCorpus	SciDocs	SciFact	Average
ColBERT v2	110M	0.2356	0.1229	0.5002	0.2862
ColBERT MUVERA Femto	0.2M	0.1851	0.0411	0.3518	0.1927
ColBERT MUVERA Pico	0.4M	0.1926	0.0564	0.4424	0.2305
ColBERT MUVERA Nano	0.9M	0.2355	0.0807	0.4904	0.2689
ColBERT MUVERA Micro	4M	0.2348	0.0882	0.4875	0.2702

Note: The scores reported don't match scores reported in the respective papers due to different default settings in the txtai benchmark scripts.

As noted earlier, models trained with min-max score normalization don't perform well with MUVERA encoding. See this GitHub Issue for more.

This model has only about 250K parameters and a file size of roughly 950 KB. With that in mind, the scores are surprisingly decent!
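
For readers unfamiliar with the metric used in the tables above, here is a minimal sketch of nDCG@10 in plain Python. It uses a common formulation with linear gain and a log2 discount; the txtai benchmark script may differ in details, and the function names and toy data are assumptions for the example.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the first k relevance values."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_ids, relevance, k=10):
    """nDCG@k for one query: DCG of the ranking divided by DCG of the ideal ranking."""
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids]
    ideal = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg(ideal, k)
    return dcg(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy example: documents "a" and "c" are relevant, the system ranks them 1st and 3rd
relevance = {"a": 1, "c": 1}
print(ndcg_at_k(["a", "c", "b"], relevance))  # perfect ordering: 1.0
print(ndcg_at_k(["a", "b", "c"], relevance))  # relevant document pushed down: below 1.0
```

The benchmark score for a dataset is the mean of this per-query value over all queries.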

Nano BEIR

Dataset: NanoBEIR_mean
Evaluated with pylate.evaluation.nano_beir_evaluator.NanoBEIREvaluator

Metric	Value
MaxSim_accuracy@1	0.4318
MaxSim_accuracy@3	0.5753
MaxSim_accuracy@5	0.64
MaxSim_accuracy@10	0.7062
MaxSim_precision@1	0.4318
MaxSim_precision@3	0.2655
MaxSim_precision@5	0.215
MaxSim_precision@10	0.149
MaxSim_recall@1	0.2379
MaxSim_recall@3	0.3485
MaxSim_recall@5	0.4115
MaxSim_recall@10	0.4745
MaxSim_ndcg@10	0.4495
MaxSim_mrr@10	0.5194
MaxSim_map@100	0.3725
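
As a reading aid for the table above, the per-query versions of these metrics can be sketched as follows. This is a plain-Python illustration with hypothetical function names, not the NanoBEIREvaluator code; the evaluator reports the mean over all queries and datasets.

```python
def hits_at_k(ranked_ids, relevant_ids, k):
    """Number of relevant documents among the top k results."""
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)


def accuracy_at_k(ranked_ids, relevant_ids, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if hits_at_k(ranked_ids, relevant_ids, k) > 0 else 0.0


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k results that are relevant."""
    return hits_at_k(ranked_ids, relevant_ids, k) / k


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents found in the top k."""
    return hits_at_k(ranked_ids, relevant_ids, k) / len(relevant_ids)


def mrr_at_k(ranked_ids, relevant_ids, k):
    """Reciprocal rank of the first relevant document within the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


# Toy example: two relevant documents, ranked 2nd and 4th
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

print(accuracy_at_k(ranked, relevant, 3))   # 1.0
print(recall_at_k(ranked, relevant, 3))     # 0.5
print(mrr_at_k(ranked, relevant, 10))       # 0.5
```

At k=1 accuracy and precision coincide by definition, which is why both rows in the table read 0.4318.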

Training Details

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 32
learning_rate: 0.0003
num_train_epochs: 1
warmup_ratio: 0.05
fp16: True
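
To illustrate how `learning_rate` and `warmup_ratio` interact, the sketch below computes the learning rate at a given step under a linear warmup followed by linear decay. The scheduler type is an assumption (it is not listed among the non-default hyperparameters above), and the total step count of 1000 is hypothetical since the card does not state it.

```python
import math


def linear_warmup_lr(step, total_steps, peak_lr=0.0003, warmup_ratio=0.05):
    """Learning rate at `step`: linear ramp to peak_lr, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run of 1000 optimizer steps: warmup covers the first 50 steps
for step in (0, 25, 50, 525, 1000):
    print(step, linear_warmup_lr(step, total_steps=1000))
```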

Framework Versions

Python: 3.10.18
Sentence Transformers: 4.0.2
PyLate: 1.3.2
Transformers: 4.57.0
PyTorch: 2.8.0+cu128
Accelerate: 1.10.1
Datasets: 4.1.1
Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
 title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
 author = "Reimers, Nils and Gurevych, Iryna",
 booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
 month = "11",
 year = "2019",
 publisher = "Association for Computational Linguistics",
 url = "https://arxiv.org/abs/1908.10084"
}

PyLate

@misc{PyLate,
title={PyLate: Flexible Training and Retrieval for Late Interaction Models},
author={Chaffin, Antoine and Sourty, Raphaël},
url={https://github.com/lightonai/pylate},
year={2024}
}

URL: https://huggingface.co/NeuML/colbert-muvera-femto
