arxiv:2406.18587

Nomic Embed Vision: Expanding the Latent Space

Published on Jun 6, 2024

Abstract

Nomic-embed-vision and nomic-embed-text form a unified latent space for high-performance vision, language, and multimodal tasks.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

This technical report describes the training of nomic-embed-vision, a highly performant, open-code, open-weights image embedding model that shares the same latent space as nomic-embed-text. Together, nomic-embed-vision and nomic-embed-text form the first unified latent space to achieve high performance across vision, language, and multimodal tasks.
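
As a rough illustration of what a shared latent space allows, the sketch below ranks images against a text query by cosine similarity. It is not the report's code. The three-dimensional vectors, the file names, and the helpers `cosine_similarity` and `rank_images` are all hypothetical stand-ins; in practice the vectors would come from the text and image models.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_vec, image_vecs):
    """Return image names sorted from most to least similar to the text vector."""
    scores = {name: cosine_similarity(text_vec, vec) for name, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy embeddings (assumption: real embeddings have many more
# dimensions and are produced by the text and vision encoders).
text_query = [0.9, 0.1, 0.0]
images = {
    "cat.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
    "dog.jpg": [0.5, 0.5, 0.0],
}

ranking = rank_images(text_query, images)
print(ranking)
```

Because both kinds of embeddings live in the same space, a text vector can be compared with image vectors directly, with no separate alignment step at query time.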
