Holo-3.1-4B-GGUF
Holo3.1: Fast & Local Computer Use Agents
Model Description
Holo3.1 is our latest family of Vision-Language Models (VLMs) for computer use agents. Building on Holo3, it expands support beyond browser and desktop automation to mobile environments, introduces native function-calling support for seamless integration with agent frameworks, and enables local deployment through optimized quantized checkpoints.
The Holo3.1 family spans model sizes from 0.8B to 35B-A3B parameters. Across computer use, UI grounding, mobile automation, and business workflows, Holo3.1 delivers strong performance while improving deployment flexibility and cost efficiency.
For more information, please visit the original model card: https://huggingface.co/Hcompany/Holo-3.1-4B
Model Files
| File Name | Quant Type | File Size | File Link |
|---|---|---|---|
| Holo-3.1-4B.BF16.gguf | BF16 | 9.7 GB | Download |
| Holo-3.1-4B.F16.gguf | F16 | 9.7 GB | Download |
| Holo-3.1-4B.Q2_K.gguf | Q2_K | 2.12 GB | Download |
| Holo-3.1-4B.Q3_K_L.gguf | Q3_K_L | 2.69 GB | Download |
| Holo-3.1-4B.Q3_K_M.gguf | Q3_K_M | 2.54 GB | Download |
| Holo-3.1-4B.Q3_K_S.gguf | Q3_K_S | 2.34 GB | Download |
| Holo-3.1-4B.Q4_0.gguf | Q4_0 | 2.9 GB | Download |
| Holo-3.1-4B.Q4_K_M.gguf | Q4_K_M | 3.07 GB | Download |
| Holo-3.1-4B.Q4_K_S.gguf | Q4_K_S | 2.92 GB | Download |
| Holo-3.1-4B.Q5_0.gguf | Q5_0 | 3.43 GB | Download |
| Holo-3.1-4B.Q5_K_M.gguf | Q5_K_M | 3.51 GB | Download |
| Holo-3.1-4B.Q5_K_S.gguf | Q5_K_S | 3.43 GB | Download |
| Holo-3.1-4B.Q6_K.gguf | Q6_K | 3.99 GB | Download |
| Holo-3.1-4B.Q8_0.gguf | Q8_0 | 5.16 GB | Download |
| Holo-3.1-4B.mmproj-bf16.gguf | mmproj-bf16 | 676 MB | Download |
| Holo-3.1-4B.mmproj-f16.gguf | mmproj-f16 | 676 MB | Download |
| Holo-3.1-4B.mmproj-q8_0.gguf | mmproj-q8_0 | 367 MB | Download |
llama.cpp
LLM inference in C/C++ — https://github.com/ggml-org/llama.cpp
FROM ghcr.io/ggml-org/llama.cpp:full
WORKDIR /app
RUN apt update && apt install -y python3-pip
RUN pip install -U huggingface_hub --break-system-packages
RUN python3 -c 'from huggingface_hub import hf_hub_download; \
repo="prithivMLmods/Holo-3.1-4B-GGUF"; \
hf_hub_download(repo_id=repo, filename="Holo-3.1-4B.Q5_K_M.gguf", local_dir="/app"); \
hf_hub_download(repo_id=repo, filename="Holo-3.1-4B.mmproj-bf16.gguf", local_dir="/app")'
CMD ["--server", \
"-m", "/app/Holo-3.1-4B.Q5_K_M.gguf", \
"--mmproj", "/app/Holo-3.1-4B.mmproj-bf16.gguf", \
"--host", "0.0.0.0", \
"--port", "7860", \
"-t", "2", \
"--cache-type-k", "q8_0", \
"--cache-type-v", "iq4_nl", \
"-c", "128000", \
"-n", "38912"]
- Downloads last month
- 1,369
2-bit
3-bit
4-bit
5-bit
6-bit
8-bit
16-bit
