gemma-4-12B-it-heretic_decensored-GGUF
gemma-4-12B-it-heretic_decensored is a reasoning-capable language model built on top of google/gemma-4-12B-it and modified using the Heretic abliteration toolkit. The model applies refusal-direction analysis and targeted weight-space interventions to reduce internal refusal behaviors while preserving instruction-following, reasoning capabilities, and general conversational performance.
This model is intended strictly for research and learning purposes. Due to reduced internal refusal mechanisms, it may generate sensitive or unrestricted content. Users assume full responsibility for how the model is used. The authors and hosting platform disclaim any liability for generated outputs.
This model is experimental and may generate unexpected behaviors or artifacts in certain scenarios.
Model Files
| File Name | Quant Type | File Size | File Link |
|---|---|---|---|
| gemma-4-12B-it-heretic_decensored.BF16.gguf | BF16 | 23.8 GB | Download |
| gemma-4-12B-it-heretic_decensored.F16.gguf | F16 | 23.8 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q2_K.gguf | Q2_K | 4.83 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q3_K_L.gguf | Q3_K_L | 6.57 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q3_K_M.gguf | Q3_K_M | 6.09 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q3_K_S.gguf | Q3_K_S | 5.53 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q4_0.gguf | Q4_0 | 6.98 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q4_K_M.gguf | Q4_K_M | 7.38 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q4_K_S.gguf | Q4_K_S | 7.02 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q5_0.gguf | Q5_0 | 8.34 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q5_K_M.gguf | Q5_K_M | 8.55 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q5_K_S.gguf | Q5_K_S | 8.34 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q6_K.gguf | Q6_K | 9.79 GB | Download |
| gemma-4-12B-it-heretic_decensored.Q8_0.gguf | Q8_0 | 12.7 GB | Download |
| gemma-4-12B-it-heretic_decensored.mmproj-bf16.gguf | mmproj-bf16 | 175 MB | Download |
| gemma-4-12B-it-heretic_decensored.mmproj-f16.gguf | mmproj-f16 | 175 MB | Download |
| gemma-4-12B-it-heretic_decensored.mmproj-q8_0.gguf | mmproj-q8_0 | 159 MB | Download |
Quick Start with llama.cpp (Docker)
FROM ghcr.io/ggml-org/llama.cpp:full
WORKDIR /app
RUN apt update && apt install -y python3-pip
RUN pip install -U huggingface_hub --break-system-packages
RUN python3 -c 'from huggingface_hub import hf_hub_download; \
repo="prithivMLmods/gemma-4-12B-it-heretic_decensored-GGUF"; \
hf_hub_download(repo_id=repo, filename="gemma-4-12B-it-heretic_decensored.Q8_0.gguf", local_dir="/app"); \
hf_hub_download(repo_id=repo, filename="gemma-4-12B-it-heretic_decensored.mmproj-f16.gguf", local_dir="/app")'
CMD ["--server", \
"-m", "/app/gemma-4-12B-it-heretic_decensored.Q8_0.gguf", \
"--mmproj", "/app/gemma-4-12B-it-heretic_decensored.mmproj-f16.gguf", \
"--host", "0.0.0.0", \
"--port", "7860", \
"-t", "2", \
"--cache-type-k", "q8_0", \
"--cache-type-v", "iq4_nl", \
"-c", "128000", \
"-n", "38912"]
llama.cpp
LLM inference in C/C++ — https://github.com/ggml-org/llama.cpp
- Downloads last month
- 1,482
2-bit
3-bit
4-bit
5-bit
6-bit
8-bit
16-bit
Model tree for prithivMLmods/gemma-4-12B-it-heretic_decensored-GGUF
Base model
google/gemma-4-12B