URL: https://huggingface.co/WhereIsAI/UAE-Large-V1

WhereIsAI/UAE-Large-V1 · Hugging Face


Universal AnglE Embedding

📢 WhereIsAI/UAE-Large-V1 is licensed under MIT. Feel free to use it in any scenario. If you use it in academic papers, please cite us via 👉 citation info.

Welcome to AnglE, a framework for training powerful sentence embeddings and running inference with them.

🏆 Achievements

  • 📅 May 16, 2024 | The AnglE paper was accepted to the ACL 2024 Main Conference.
  • 📅 Dec 4, 2023 | 🔥 Our universal English sentence embedding WhereIsAI/UAE-Large-V1 achieved SOTA on the MTEB Leaderboard with an average score of 64.64!

Usage

1. angle_emb

python -m pip install -U angle-emb

  1. Non-Retrieval Tasks

There is no need to specify any prompts.

from angle_emb import AnglE
from angle_emb.utils import cosine_similarity

angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls').cuda()
doc_vecs = angle.encode([
 'The weather is great!',
 'The weather is very good!',
 'i am going to bed'
], normalize_embedding=True)

# Print the cosine similarity of every unordered pair of sentences.
for i, dv1 in enumerate(doc_vecs):
    for dv2 in doc_vecs[i+1:]:
        print(cosine_similarity(dv1, dv2))
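
To make the pairwise comparison above concrete, here is a minimal pure-Python sketch of cosine similarity. It is an illustration only and not the implementation in angle_emb.utils; the helper names (dot, cosine, l2_normalize) and the 3-dimensional toy vectors are assumptions made for the example, not real model output. It also shows why L2-normalized embeddings, as requested with normalize_embedding=True, can be compared with a plain dot product.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def l2_normalize(u):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


# Toy vectors (hypothetical values, not real UAE-Large-V1 embeddings).
toy_vecs = [[3.0, 4.0, 0.0], [6.0, 8.0, 0.0], [0.0, 0.0, 5.0]]

# Same pairwise loop as in the usage snippet above.
for i, v1 in enumerate(toy_vecs):
    for v2 in toy_vecs[i + 1:]:
        print(round(cosine(v1, v2), 4))

# For unit-length vectors, the dot product alone equals the cosine.
a, b = l2_normalize(toy_vecs[0]), l2_normalize(toy_vecs[1])
print(round(dot(a, b), 4))
```

The first two toy vectors point in the same direction and the third is orthogonal to both, so the scores sit at the two extremes a non-negative pair can reach.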

  2. Retrieval Tasks

For retrieval, apply the prompt Prompts.C to the query only, not to the documents.

from angle_emb import AnglE, Prompts
from angle_emb.utils import cosine_similarity

angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls').cuda()
qv = angle.encode(Prompts.C.format(text='what is the weather?'))
doc_vecs = angle.encode([
 'The weather is great!',
 'it is rainy today.',
 'i am going to bed'
])

for dv in doc_vecs:
 print(cosine_similarity(qv[0], dv))
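
In a retrieval setting the scores printed above are usually sorted to pick the best-matching documents. The following is a rough sketch of that ranking step in pure Python. The function rank_documents, its top_k parameter, and the 2-dimensional toy vectors are hypothetical and stand in for real query and document embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs, docs, top_k=None):
    """Return (score, doc) pairs sorted from most to least similar.

    Hypothetical helper for illustration; top_k=None keeps every document.
    """
    scored = [(cosine(query_vec, dv), doc) for dv, doc in zip(doc_vecs, docs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data (made-up values, not real UAE-Large-V1 embeddings).
toy_query = [1.0, 0.0]
toy_docs = ['doc A', 'doc B', 'doc C']
toy_doc_vecs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.1]]

for score, doc in rank_documents(toy_query, toy_doc_vecs, toy_docs):
    print(f'{score:.3f}  {doc}')
```

With real embeddings, query_vec would be the prompted query vector and doc_vecs the unprompted document vectors.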

2. sentence-transformers

from angle_emb import Prompts
from scipy import spatial  # provides spatial.distance.cosine, used below
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("WhereIsAI/UAE-Large-V1").cuda()

qv = model.encode(Prompts.C.format(text='what is the weather?'))
doc_vecs = model.encode([
 'The weather is great!',
 'it is rainy today.',
 'i am going to bed'
])

for dv in doc_vecs:
 print(1 - spatial.distance.cosine(qv, dv))

3. Infinity

Infinity is an MIT-licensed server for OpenAI-compatible deployment.

docker run --gpus all -v $PWD/data:/app/.cache -p "7997":"7997" \
michaelf34/infinity:latest \
v2 --model-id WhereIsAI/UAE-Large-V1 --revision "369c368f70f16a613f19f5598d4f12d9f44235d4" --dtype float16 --batch-size 32 --device cuda --engine torch --port 7997
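
Once the container is running, a client can request embeddings over HTTP. The sketch below uses only the Python standard library and makes several assumptions that should be checked against the Infinity documentation for your version: that the server exposes an OpenAI-style POST /embeddings route on the published port 7997, that the request body carries "model" and "input" fields, and that the response holds a "data" list of items with "index" and "embedding" keys. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7997"  # port published by the docker command above
MODEL_ID = "WhereIsAI/UAE-Large-V1"


def build_embedding_request(texts, base_url=BASE_URL, model=MODEL_ID):
    """Build a POST request for an OpenAI-style embeddings route (assumed path)."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",  # assumption: route name may differ
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response dict, ordered by index."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def fetch_embeddings(texts):
    """Send the request to a running server and return one vector per text."""
    with urllib.request.urlopen(build_embedding_request(texts)) as resp:
        return parse_embeddings(json.load(resp))


# Example call (requires the server to be up):
# vectors = fetch_embeddings(["The weather is great!", "it is rainy today."])
```

Separating request construction from response parsing keeps both pieces checkable without a live server.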

Citation

If you use our pre-trained models, please support us by citing our work:

@article{li2023angle,
 title={AnglE-optimized Text Embeddings},
 author={Li, Xianming and Li, Jing},
 journal={arXiv preprint arXiv:2309.12871},
 year={2023}
}

Downloads last month: 1,817,020

Safetensors
Model size: 0.3B params
Tensor type: F32

Model tree for WhereIsAI/UAE-Large-V1

  • Adapters: 2 models
  • Finetunes: 10 models
  • Merges: 1 model
  • Quantizations: 6 models

Spaces using WhereIsAI/UAE-Large-V1: 69
