URL: https://huggingface.co/VAGOsolutions/SauerkrautLM-ColQwen3-2b-v0.1


SauerkrautLM-ColQwen3-2b-v0.1

🥇 Best 128-dim Model in Medium (1-3B) Category | +1.01 over ColQwen2

SauerkrautLM-ColQwen3-2b-v0.1 achieves 90.24 NDCG@5 on ViDoRe v1, which ranks it #1 in the Medium (1-3B) category among 128-dim models and puts it +1.01 points ahead of the baseline ColQwen2-v1.0 (89.23).
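
For readers unfamiliar with the metric, here is a minimal sketch of how NDCG@5 can be computed for binary relevance labels. It assumes the standard log2-discounted definition and that the reported figures are per-query scores averaged and scaled by 100. The function names and toy rankings are illustrative and are not taken from the ViDoRe evaluation code.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k ranked results."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=5):
    """NDCG@k for one query.

    `relevances` lists the relevance label of every candidate page in the
    order the retriever ranked them (1 = relevant, 0 = not relevant).
    """
    ideal = sorted(relevances, reverse=True)  # best possible ordering
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical rankings for three queries (illustrative labels, not benchmark data).
rankings = [
    [1, 0, 0, 0, 0, 0],  # relevant page ranked first
    [0, 1, 0, 0, 0, 0],  # relevant page ranked second
    [0, 0, 0, 0, 0, 1],  # relevant page falls outside the top 5
]
per_query = [ndcg_at_k(r, k=5) for r in rankings]
mean_ndcg_percent = 100 * sum(per_query) / len(per_query)
print(f"mean NDCG@5: {mean_ndcg_percent:.2f}")
```

A relevant page at rank 1 contributes a full score, lower ranks are discounted logarithmically, and a page outside the top 5 contributes nothing.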

[Figure: ViDoRe v1 Benchmark - 128-dim Models]

🎯 Why Visual Document Retrieval?

Traditional OCR-based retrieval loses layout, tables, and visual context. Our visual approach:

  • ✅ No OCR errors - Direct visual understanding
  • ✅ Layout-aware - Understands tables, forms, charts
  • ✅ End-to-end - Single model, no pipeline complexity

🏆 Key Achievements

| Benchmark | Score | Rank (128-dim) |
|---|---|---|
| ViDoRe v1 | 90.24 | #5 |
| MTEB v1+v2 | 81.02 | #6 |
| ViDoRe v3 | 54.32 | #5 |

Medium Category Comparison (1-3B, 128-dim)

| Model | Params | Dim | ViDoRe v1 | MTEB v1+v2 | ViDoRe v3 |
|---|---|---|---|---|---|
| SauerkrautLM-ColQwen3-2b-v0.1 ⭐ | 2.2B | 128 | 90.24 | 81.02 | 54.32 |
| colqwen2-v1.0 | 2.2B | 128 | 89.23 | 79.74 | 44.18 |
| SauerkrautLM-ColQwen3-1.7b-Turbo-v0.1 | 1.7B | 128 | 88.89 | 77.94 | 48.76 |

#1 in Medium category on all three benchmarks!

Detailed Benchmark Results

Improvement over Baseline

| Metric | ColQwen3-2b | ColQwen2-v1.0 | Improvement |
|---|---|---|---|
| ViDoRe v1 | 90.24 | 89.23 | +1.01 |
| MTEB v1+v2 | 81.02 | 79.74 | +1.28 |
| ViDoRe v3 | 54.32 | 44.18 | +10.14 |
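
The improvement column is a plain difference of the two scores. The short sketch below reproduces it from the published numbers and also derives the relative gain, which the card does not report and is computed here only for illustration. The helper name `improvement` is hypothetical.

```python
# Scores copied from the comparison table above: (ColQwen3-2b, ColQwen2-v1.0).
scores = {
    "ViDoRe v1": (90.24, 89.23),
    "MTEB v1+v2": (81.02, 79.74),
    "ViDoRe v3": (54.32, 44.18),
}

def improvement(new, baseline):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = round(new - baseline, 2)
    relative = round(100 * (new - baseline) / baseline, 2)
    return absolute, relative

gains = {name: improvement(new, base) for name, (new, base) in scores.items()}
for name, (absolute, relative) in gains.items():
    print(f"{name}: +{absolute:.2f} points ({relative:.2f}% relative)")
```

The absolute gains match the table, and the relative view shows that the ViDoRe v3 gap is by far the largest of the three.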

📋 Summary Tables

128-dim Models Comparison

[Figure: 128-dim Models Summary]

Comparison vs High-dim Models

[Figure: High-dim Comparison]

✨ Key Features

  • 🥇 #1 in Medium Category: Best 1-3B model among 128-dim models
  • 📈 +1.01 over ColQwen2: 90.24 vs. 89.23 NDCG@5 on ViDoRe v1
  • 💾 Consumer GPU Ready: Only ~4.4 GB VRAM in bfloat16
  • ⚡ Compact Embeddings: 128-dimensional
  • 🌍 Multilingual: 6 languages (EN, DE, FR, ES, IT, PT)
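
To make the "compact embeddings" point concrete, here is a rough storage estimate. It assumes the model emits one 128-dimensional vector per token, as multi-vector retrievers in the ColPali family do, and that vectors are stored at 2 bytes per value. The number of vectors per page is a hypothetical placeholder, since it depends on image resolution and is not stated in this card.

```python
EMBEDDING_DIM = 128   # from the model card
BYTES_PER_VALUE = 2   # assumption: vectors stored in bfloat16/float16

def page_embedding_bytes(num_vectors, dim=EMBEDDING_DIM, bytes_per_value=BYTES_PER_VALUE):
    """Storage for one page represented by `num_vectors` vectors of size `dim`."""
    return num_vectors * dim * bytes_per_value

# Hypothetical value: the real number of vectors per page is not given in the card.
vectors_per_page = 768
per_page = page_embedding_bytes(vectors_per_page)
print(f"{per_page / 1024:.0f} KiB per page, "
      f"{per_page * 10_000 / 1024**3:.2f} GiB for 10,000 pages")
```

Storage scales linearly with the embedding dimension, so a model with wider vectors needs proportionally more index space for the same number of vectors per page.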

Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-VL-2B |
| Parameters | 2.2B |
| Embedding Dimension | 128 |
| VRAM (bfloat16) | ~4.4 GB |
| Max Context Length | 262,144 tokens |
| License | Apache 2.0 |
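
The VRAM figure is consistent with a back-of-the-envelope estimate of the weights alone: 2.2B parameters at 2 bytes each in bfloat16. The sketch below assumes exactly that and ignores activations, image batches, and framework overhead, all of which add to real usage. The helper name is illustrative.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

NUM_PARAMS = 2.2e9  # parameter count from the table above

bf16_gb = weight_memory_gb(NUM_PARAMS, 2)  # bfloat16: 2 bytes per parameter
fp32_gb = weight_memory_gb(NUM_PARAMS, 4)  # float32: 4 bytes per parameter
print(f"bfloat16: ~{bf16_gb:.1f} GB, float32: ~{fp32_gb:.1f} GB")
```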

Training

Hardware & Configuration

| Setting | Value |
|---|---|
| GPUs | 4x NVIDIA RTX 6000 Ada (48GB) |
| Effective Batch Size | 256 |
| Precision | bfloat16 |
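
The card states the effective batch size and the GPU count but not how the batch was split. As a sketch, assuming the common relation effective batch = GPUs × per-device batch × gradient-accumulation steps, the following lists the combinations that would reach 256 on 4 GPUs. The per-device sizes are hypothetical and none of them is confirmed as the configuration actually used.

```python
def effective_batch_size(num_gpus, per_device_batch, grad_accum_steps):
    """Samples contributing to one optimizer step across all devices."""
    return num_gpus * per_device_batch * grad_accum_steps

NUM_GPUS = 4   # from the table above
TARGET = 256   # effective batch size from the table above

# Hypothetical per-device batch sizes, each paired with the accumulation steps it needs.
combos = [
    (per_device, TARGET // (NUM_GPUS * per_device))
    for per_device in (1, 2, 4, 8, 16, 32, 64)
    if TARGET % (NUM_GPUS * per_device) == 0
]
for per_device, accum in combos:
    print(f"per-device batch {per_device:>2} x accumulation {accum:>2} "
          f"-> {effective_batch_size(NUM_GPUS, per_device, accum)}")
```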

Datasets

| Dataset | Type | Description |
|---|---|---|
| vidore/colpali_train_set | Public | ColPali training data |
| openbmb/VisRAG-Ret-Train-In-domain-data | Public | Visual RAG training data |
| llamaindex/vdr-multilingual-train | Public | Multilingual document retrieval |
| VAGO Multilingual Dataset 1 | In-house | Proprietary multilingual document-query pairs |
| VAGO Multilingual Dataset 2 | In-house | Proprietary multilingual document-query pairs |

Installation & Usage

⚠️ Important: Install our package before loading the model:

pip install git+https://github.com/VAGOsolutions/sauerkrautlm-colpali

import torch
from PIL import Image
from sauerkrautlm_colpali.models import ColQwen3, ColQwen3Processor

model_name = "VAGOsolutions/SauerkrautLM-ColQwen3-2b-v0.1"

# Load the model in bfloat16 on the first GPU and switch to inference mode.
# attn_implementation="flash_attention_2" requires the flash-attn package.
model = ColQwen3.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="cuda:0",
).eval()

processor = ColQwen3Processor.from_pretrained(model_name)

# Inputs: document page images and text queries.
images = [Image.open("document.png")]
queries = ["What is the main topic?"]

# Preprocess both inputs and move the batches to the model's device.
batch_images = processor.process_images(images).to(model.device)
batch_queries = processor.process_queries(queries).to(model.device)

# Embed images and queries without tracking gradients.
with torch.no_grad():
    image_embeddings = model(**batch_images)
    query_embeddings = model(**batch_queries)

# Compute query-to-image relevance scores.
scores = processor.score(query_embeddings, image_embeddings)
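
For intuition about the scoring step, the sketch below shows late-interaction (MaxSim) scoring as used by ColBERT- and ColPali-style multi-vector retrievers. It rests on the assumption that `processor.score` follows this scheme, which this card does not spell out. The helper names and toy 2-dimensional vectors are illustrative only.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vectors, page_vectors):
    """Late-interaction score: for each query vector take the similarity of its
    best-matching page vector, then sum those maxima."""
    return sum(max(dot(q, p) for p in page_vectors) for q in query_vectors)

def rank_pages(query_vectors, pages, top_k=3):
    """Return (page_index, score) pairs sorted from best to worst."""
    scored = [(i, maxsim_score(query_vectors, p)) for i, p in enumerate(pages)]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]

# Toy 2-dimensional vectors (the real model uses 128 dimensions).
query = [[1.0, 0.0], [0.0, 1.0]]
pages = [
    [[1.0, 0.0], [0.0, 1.0]],  # matches both query vectors
    [[1.0, 0.0], [1.0, 0.0]],  # matches only the first query vector
    [[0.6, 0.6]],              # partial match to both query vectors
]
ranking = rank_pages(query, pages)
for page_index, score in ranking:
    print(f"page {page_index}: {score:.2f}")
```

Because every query vector is matched against every page vector, scoring cost grows with the number of vectors per page, which is one reason a small embedding dimension matters.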

📊 Additional Benchmark Visualizations

MTEB v1+v2 Benchmark (128-dim Models)

[Figure: MTEB v1+v2 Benchmark - 128-dim Models]

ViDoRe v3 Benchmark (128-dim Models)

[Figure: ViDoRe v3 Benchmark - 128-dim Models]

Our Models vs High-dim Models

[Figure: ViDoRe v1 - Our Models vs High-dim]

Citation

@misc{sauerkrautlm-colpali-2025,
 title={SauerkrautLM-ColPali: Multi-Vector Vision Retrieval Models},
 author={David Golchinfar},
 organization={VAGO Solutions},
 year={2025},
 url={https://github.com/VAGOsolutions/sauerkrautlm-colpali}
}

Contact
