The embedding model trained by Jina AI.
Jina Embeddings v4: Universal Embeddings for Multimodal Multilingual Retrieval
Intended Usage & Model Info
jina-embeddings-v4 is a universal embedding model for multimodal and multilingual retrieval.
The model is specially designed for complex document retrieval, including visually rich documents with charts, tables, and illustrations.
Built on Qwen/Qwen2.5-VL-3B-Instruct, jina-embeddings-v4 features:
- Unified embeddings for text, images, and visual documents, supporting both dense (single-vector) and late-interaction (multi-vector) retrieval.
- Multilingual support (30+ languages) and compatibility with a wide range of domains, including technical and visually complex documents.
- Task-specific adapters for retrieval, text matching, and code-related tasks, which can be selected at inference time.
- Flexible embedding size: dense embeddings are 2048 dimensions by default but can be truncated to as low as 128 with minimal performance loss.
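The truncation described in the last bullet can be sketched as follows. This is a minimal, library-free illustration rather than the model's own API: `truncate_embedding` and `cosine` are hypothetical helpers, the vectors are toy data, and re-normalizing after truncation is an assumption that suits cosine-similarity search.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components of a dense embedding and L2-normalize them.

    Hypothetical helper for illustration; not part of any Jina AI library.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    # Re-normalize so dot products on truncated vectors equal cosine similarity.
    return [x / norm for x in head] if norm > 0.0 else head

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy 8-dimensional stand-ins for 2048-dimensional embeddings (made-up values).
query = [0.9, 0.1, 0.3, 0.2, 0.05, 0.01, 0.02, 0.03]
doc = [0.8, 0.2, 0.25, 0.1, 0.02, 0.04, 0.01, 0.05]

full_score = cosine(query, doc)
short_score = cosine(truncate_embedding(query, 4), truncate_embedding(doc, 4))
print(f"full: {full_score:.3f}  truncated: {short_score:.3f}")
```

With real embeddings, `dim` would be any size from 2048 down to 128, and the same truncation must be applied to both queries and documents before comparing them.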
Summary of features:
| Feature | Jina Embeddings V4 |
|---|---|
| Base Model | Qwen2.5-VL-3B-Instruct |
| Supported Tasks | retrieval, text-matching, code |
| Model DType | BFloat16 |
| Max Sequence Length | 32768 |
| Single-Vector Dimension | 2048 |
| Multi-Vector Dimension | 128 |
| Matryoshka dimensions | 128, 256, 512, 1024, 2048 |
| Pooling Strategy | Mean pooling |
| Attention Mechanism | FlashAttention2 |
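To make the single-vector and multi-vector rows more concrete, here is a small sketch of the two scoring styles. It assumes ColBERT-style MaxSim scoring for late interaction, which the table does not spell out, and the names `mean_pool`, `dense_score`, and `maxsim_score` are hypothetical. For simplicity the toy example pools the same token vectors for both scores, whereas the model emits separate 2048-dimensional and 128-dimensional outputs.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def mean_pool(token_vectors):
    """Average per-token vectors into a single vector (mean pooling)."""
    n = len(token_vectors)
    return [sum(column) / n for column in zip(*token_vectors)]

def dense_score(query_tokens, doc_tokens):
    """Single-vector scoring: compare one pooled vector per side."""
    return dot(mean_pool(query_tokens), mean_pool(doc_tokens))

def maxsim_score(query_tokens, doc_tokens):
    """Late interaction (assumed MaxSim): for each query token, take its
    best-matching document token, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

# Toy 2-dimensional token vectors (made-up values).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

print("dense:", dense_score(query_tokens, doc_tokens))
print("late interaction:", maxsim_score(query_tokens, doc_tokens))
```

The trade-off this illustrates is that late interaction keeps one vector per token, so it can reward fine-grained token matches at the cost of storing more vectors per document.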
Training & Evaluation
Please refer to our technical report on jina-embeddings-v4 for training details and benchmarks.
Jina-VDR
Alongside jina-embeddings-v4, we are releasing Jina VDR, a multilingual, multi-domain benchmark for visual document retrieval. The task collection and the evaluation instructions are both available online.
License
This model was initially released under cc-by-nc-4.0 in error. The correct license is the Qwen Research License, because this model is derived from Qwen-2.5-VL-3B, which is governed by that license.
Contact
Join our Discord community and chat with other community members about ideas.
Citation
If you find jina-embeddings-v4 useful in your research, please cite the following paper:
@misc{günther2025jinaembeddingsv4universalembeddingsmultimodal,
title={jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval},
author={Michael Günther and Saba Sturua and Mohammad Kalim Akram and Isabelle Mohr and Andrei Ungureanu and Sedigheh Eslami and Scott Martens and Bo Wang and Nan Wang and Han Xiao},
year={2025},
eprint={2506.18902},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2506.18902},
}
