Gemma-4-E2B-IT-SFT-RLVR-Medical
Gemma-4-E2B-it fine-tuned on PubMedQA using SFT and RLVR.
Also check out the training code on GitHub.
Quantized models are available here.
Setup
#!pip install transformers torch accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("lukasdrews/Gemma-4-E2B-IT-SFT-RLVR-Medical")
model = AutoModelForCausalLM.from_pretrained("lukasdrews/Gemma-4-E2B-IT-SFT-RLVR-Medical")

# A single-turn chat containing one PubMedQA-style question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Do GEC produce and bear factor H under complement attack?"}
        ]
    },
]

# Apply the chat template and tokenize in one step.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=1024)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
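PubMedQA answers are one of yes, no, or maybe, so the decoded text usually has to be mapped to a label before scoring. The following is a minimal sketch of one way to do that. The helper name `extract_label` is hypothetical, and the assumption that the model states its decision as a plain word near the end of its output is not documented in this card; adjust the parsing to the format you actually observe.

```python
import re
from typing import Optional

# Hypothetical helper: not part of this model's repository.
# Assumes the final decision appears as the last standalone
# "yes", "no", or "maybe" in the generated text.
_LABEL_RE = re.compile(r"\b(yes|no|maybe)\b", re.IGNORECASE)


def extract_label(generated_text: str) -> Optional[str]:
    """Return the last yes/no/maybe found in the text, or None if absent."""
    matches = _LABEL_RE.findall(generated_text)
    return matches[-1].lower() if matches else None


if __name__ == "__main__":
    print(extract_label("The study supports this. Final answer: Yes."))
```

Taking the last match rather than the first is a design choice: reasoning text may mention other labels before the final decision.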
Benchmarks
| Model | Quantization | PubMedQA (In-Domain) | MedQA-USMLE (Zero-Shot Transfer) |
|---|---|---|---|
| Gemma-4-E2B-it (base model) | - | 58.10 % | 29.54 % |
| Gemma-4-E2B-it + SFT + RLVR | - | 73.10 % | 43.05 % |
| Gemma-4-E2B-it + SFT + RLVR | Q8_0 | 72.40 % | 43.00 % |
| Gemma-4-E2B-it + SFT + RLVR | Q6_K | 72.10 % | 42.18 % |
| Gemma-4-E2B-it + SFT + RLVR | Q5_K_M | 72.00 % | 38.88 % |
| Gemma-4-E2B-it + SFT + RLVR | Q4_K_M | 71.80 % | 38.88 % |
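To make the table easier to read, the sketch below computes the gain of the fine-tuned model over the base model and the change each quantization level introduces relative to the unquantized fine-tuned model. The numbers are copied from the table; the variable and function names are illustrative only.

```python
# Scores in percent, copied from the benchmark table.
BASE = {"pubmedqa": 58.10, "medqa": 29.54}
TUNED = {
    "unquantized": {"pubmedqa": 73.10, "medqa": 43.05},
    "Q8_0": {"pubmedqa": 72.40, "medqa": 43.00},
    "Q6_K": {"pubmedqa": 72.10, "medqa": 42.18},
    "Q5_K_M": {"pubmedqa": 72.00, "medqa": 38.88},
    "Q4_K_M": {"pubmedqa": 71.80, "medqa": 38.88},
}


def delta(scores, reference):
    """Difference in percentage points, rounded to two decimals."""
    return {k: round(scores[k] - reference[k], 2) for k in scores}


# Gain from SFT + RLVR over the base model.
gain = delta(TUNED["unquantized"], BASE)

# Change introduced by each quantization level.
quant_change = {
    name: delta(scores, TUNED["unquantized"])
    for name, scores in TUNED.items()
    if name != "unquantized"
}

if __name__ == "__main__":
    print("gain over base:", gain)
    for name, change in quant_change.items():
        print(name, change)
```

Fine-tuning adds 15.00 points on PubMedQA and 13.51 points on MedQA-USMLE. Quantization costs at most 1.30 points in-domain, while Q5_K_M and Q4_K_M both lose 4.17 points on the transfer benchmark.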
Format: Safetensors. Model size: 6B params. Tensor type: BF16.
