Gemma 3 4B Fine-tuned on African Language Datasets
This model is a fine-tuned version of google/gemma-3-4b-it on three African language datasets:
- AfriMGSM: Math word problems in African languages
- AfriMMLU: Multiple-choice questions in African languages
- AfriSenti: Sentiment analysis for African languages
Model Description
- Base Model: google/gemma-3-4b-it
- Fine-tuning Method: QLoRA (4-bit quantization + LoRA)
- LoRA Rank: 32
- Training Hardware: NVIDIA A30 24GB
- Languages: Multiple African languages
Training Details
Datasets
- AfriMGSM: Mathematical reasoning in African languages
- AfriMMLU: General knowledge multiple-choice questions
- AfriSenti: Sentiment classification
Training Hyperparameters
- Batch Size: 4
- Gradient Accumulation Steps: 8 (Effective batch size: 32)
- Learning Rate: 2e-4
- LoRA Rank: 32
- LoRA Alpha: 64
- Max Sequence Length: 512
- Epochs: 3
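The effective batch size and the LoRA scaling factor follow directly from the values listed above. The sketch below only restates that arithmetic; `TrainingSetup` is a hypothetical container and not the author's training script. It assumes a single GPU (so no data-parallel multiplier) and the standard LoRA scaling of alpha divided by rank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hypothetical container mirroring the hyperparameters listed above."""

    batch_size: int = 4
    gradient_accumulation_steps: int = 8
    learning_rate: float = 2e-4
    lora_rank: int = 32
    lora_alpha: int = 64
    max_seq_length: int = 512
    epochs: int = 3

    @property
    def effective_batch_size(self) -> int:
        # Examples seen per optimizer step on one device (assumption: 1 GPU).
        return self.batch_size * self.gradient_accumulation_steps

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the adapter update by alpha / rank.
        return self.lora_alpha / self.lora_rank


setup = TrainingSetup()
print(setup.effective_batch_size)  # 32
print(setup.lora_scaling)          # 2.0
```

With rank 32 and alpha 64, the adapter update is scaled by 2, the common "alpha = 2 x rank" setting.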
How to Use
Installation
pip install transformers peft torch accelerate
Loading the Model
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
"google/gemma-3-4b-it",
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "Ibikemi/gemma-3-4b-african-finetuned")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")
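# Generate (Gemma chat format: a user turn followed by an open model turn)
prompt = "<start_of_turn>user\nWhat is 5 + 3?<end_of_turn>\n<start_of_turn>model\n"
# Move inputs to the model's device instead of assuming "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
# The decoded text contains the prompt followed by the model's reply
print(tokenizer.decode(outputs[0], skip_special_tokens=True))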
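The hand-written prompt above follows Gemma's turn markers. As a minimal sketch, the hypothetical helper `build_gemma_prompt` below reproduces that string from a list of chat messages; it is an illustration of the format used in this card, not part of any library. In practice, the tokenizer's own `apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is probably the safer choice, since it applies the template shipped with the model.

```python
from typing import Dict, List


def build_gemma_prompt(messages: List[Dict[str, str]]) -> str:
    """Render chat messages with the turn markers used in the prompt above.

    Hypothetical helper for illustration only.
    """
    parts = []
    for message in messages:
        # Gemma labels assistant turns "model" rather than "assistant".
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n")
    # Leave a model turn open so generation continues from here.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = build_gemma_prompt([{"role": "user", "content": "What is 5 + 3?"}])
print(repr(prompt))
```

The resulting `prompt` can be passed to the tokenizer exactly as in the loading example.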
Use Cases
- Math Problem Solving: Solve mathematical word problems in African languages
- Question Answering: Answer multiple-choice questions in African languages
- Sentiment Analysis: Analyze sentiment in African language text
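For the multiple-choice use case, the question and options need to be rendered into a single user message, and the answer letter pulled back out of the generated text. The sketch below shows one way to do that. The prompt layout and the helper names `format_mcq_prompt` and `extract_choice` are assumptions for illustration; this card does not state the format used during fine-tuning, so results may differ if the training prompts looked different.

```python
import re
from typing import Optional, Sequence

CHOICE_LABELS = "ABCD"


def format_mcq_prompt(question: str, choices: Sequence[str]) -> str:
    """Render a question and 2-4 options as one user message (assumed layout)."""
    if not 2 <= len(choices) <= len(CHOICE_LABELS):
        raise ValueError("expected between 2 and 4 choices")
    lines = [question.strip()]
    for label, choice in zip(CHOICE_LABELS, choices):
        lines.append(f"{label}. {choice.strip()}")
    lines.append("Answer with the letter of the correct option.")
    return "\n".join(lines)


def extract_choice(generated: str) -> Optional[str]:
    """Return the first standalone uppercase A-D in the output, if any.

    Naive by design: an English reply starting with the article "A"
    would be misread, so constrain the model's reply format where possible.
    """
    match = re.search(r"\b([ABCD])\b", generated)
    return match.group(1) if match else None


question = "Mji mkuu wa Kenya ni upi?"
options = ["Nairobi", "Mombasa", "Kisumu", "Nakuru"]
print(format_mcq_prompt(question, options))
```

The formatted text goes into the user turn of the chat prompt, and `extract_choice` is applied to the newly generated portion of the output only.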
Limitations
- Inherits the limitations of the base Gemma 3 4B model
- Performance may vary across African languages
- Results are likely to be best on languages that are well represented in the training data
Citation
If you use this model, please cite the original datasets:
@misc{afrimgsm,
title={AfriMGSM: African Mathematical Reasoning},
author={Masakhane},
year={2024}
}
@misc{afrimmlu,
title={AfriMMLU: African Languages MMLU},
author={Masakhane},
year={2024}
}
@misc{afrisenti,
title={AfriSenti: Sentiment Analysis for African Languages},
author={HausaNLP},
year={2023}
}
License
This model inherits the Gemma license from the base model.
