olmOCR-2-7B-1025 (GGUF)
GGUF quantized version of allenai/olmOCR-2-7B-1025 — Allen AI's state-of-the-art OCR vision-language model, optimized for local inference with llama.cpp and Ollama.
olmOCR excels at extracting structured text from documents, PDFs, images, and handwriting — all running locally on your hardware.
Quick Start
With Ollama
ollama run hf.co/richardyoung/olmOCR-2-7B-1025-GGUF
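For scripted use, the same model can be reached through Ollama's local HTTP API instead of the interactive prompt. The following is a minimal sketch, assuming an Ollama server on its default port (11434) and the `/api/generate` endpoint, which accepts base64-encoded images in an `images` array. The function names `build_payload` and `ocr_image` are illustrative and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: send one image to a local Ollama server for OCR.
MODEL="hf.co/richardyoung/olmOCR-2-7B-1025-GGUF"

# build_payload IMAGE_PATH PROMPT
# Prints a single-line JSON request body.
# Assumption: PROMPT contains no double quotes or backslashes (no JSON escaping is done).
build_payload() {
  local image_path="$1" prompt="$2" b64
  # Encode the image as base64 and strip any line wrapping.
  b64="$(base64 < "$image_path" | tr -d '\n')"
  printf '{"model":"%s","prompt":"%s","images":["%s"],"stream":false}\n' \
    "$MODEL" "$prompt" "$b64"
}

# ocr_image IMAGE_PATH
# Posts the request to the local server (assumed default port) and prints the raw JSON reply.
ocr_image() {
  build_payload "$1" "Extract all text from this document." |
    curl -s http://localhost:11434/api/generate -d @-
}

# Usage (requires a running Ollama server with the model pulled):
#   ocr_image document.png
```

With `"stream": false` the server returns one JSON object; the extracted text should be in its `response` field.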
With llama.cpp
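Download the Q8_0 quantization, then run it on an image. Vision models in llama.cpp also need a multimodal projector file passed with --mmproj, and depending on your build the multimodal binary may be llama-mtmd-cli rather than llama-cli.
huggingface-cli download richardyoung/olmOCR-2-7B-1025-GGUF \
  --include "*Q8_0*" --local-dir ./models
./llama-cli -m ./models/*Q8_0*.gguf \
  --image document.png \
  -p "Extract all text from this document." \
  -ngl 99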
Why This Model?
- Best-in-class OCR: olmOCR outperforms many commercial OCR solutions on academic benchmarks
- Local & private: Process sensitive documents without sending them to cloud APIs
- Structured output: Extracts text with layout awareness — tables, columns, headers
- GGUF format: Runs on consumer hardware with llama.cpp (CPU or GPU)
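As a rough check on whether the model fits on a given machine, the weight footprint can be estimated from the parameter count and the bits per weight. The sketch below assumes the 8B parameter count listed for this repository and about 8.5 bits per weight for Q8_0 (8-bit values plus a 16-bit scale per block of 32 weights). It ignores the KV cache, the vision projector, and runtime overhead, so treat the result as a lower bound. The function name is illustrative.

```shell
# estimate_weights_gb PARAMS_BILLIONS BITS_PER_WEIGHT
# Prints the approximate size of the weights alone, in GB (10^9 bytes).
estimate_weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

estimate_weights_gb 8 8.5   # Q8_0 at roughly 8.5 bits per weight
```

Under these assumptions the weights alone come to roughly 8.5 GB, so a machine needs noticeably more free RAM or VRAM than that to run the model comfortably.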
Use Cases
- Extract text from scanned PDFs and documents
- Digitize handwritten notes
- Process invoices, receipts, and forms locally
- Build privacy-preserving document pipelines
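A local pipeline for the use cases above can be as simple as a loop over page images. The following is a minimal sketch, assuming the pages have already been rendered to PNG files and that the Ollama CLI picks up an image path included in the prompt of a multimodal model. `batch_ocr`, `run_ocr`, and `OCR_CMD` are hypothetical names introduced here; swap `run_ocr` for whatever OCR command you actually use.

```shell
#!/usr/bin/env bash
# Sketch: OCR every PNG in a directory and write one .txt file per image.

# Name of the command or function that takes an image path and prints text.
OCR_CMD="${OCR_CMD:-run_ocr}"

# Default OCR step (assumption: Ollama reads the image path from the prompt).
run_ocr() {
  ollama run hf.co/richardyoung/olmOCR-2-7B-1025-GGUF \
    "Extract all text from this document. $1"
}

# batch_ocr INPUT_DIR OUTPUT_DIR
batch_ocr() {
  local in_dir="$1" out_dir="$2" img name
  mkdir -p "$out_dir"
  for img in "$in_dir"/*.png; do
    [ -e "$img" ] || continue          # skip when the glob matches nothing
    name="$(basename "$img" .png)"
    "$OCR_CMD" "$img" > "$out_dir/$name.txt"
  done
}

# Usage:
#   batch_ocr ./scans ./text
```

Keeping the OCR step behind a single variable makes it easy to switch between Ollama and llama.cpp without touching the loop, and nothing leaves the machine at any point.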
Also Available
- olmOCR MLX 8-bit — Apple Silicon optimized
- olmOCR MLX 6-bit — Smaller footprint on Mac
- olmOCR MLX 4-bit — Minimum RAM on Mac
Other Models by richardyoung
- Abliterated/Uncensored models: Qwen2.5-7B | Qwen3-14B | DeepSeek-R1-32B | Qwen3-8B
- MLX quantizations (Apple Silicon): Kimi-K2 series | olmOCR MLX
- OCR & Vision: olmOCR GGUF
- Healthcare/Medical: Synthea 575K patients dataset | CardioEmbed
- Research: LLM Instruction-Following Evaluation (arxiv:2510.18892)
Format: GGUF | Model size: 8B params | Architecture: qwen2vl | Quantization: 8-bit
