Gemma 3 4B Fine-tuned on African Language Datasets

This model is a fine-tuned version of google/gemma-3-4b-it on three African language datasets:

  • AfriMGSM: Math word problems in African languages
  • AfriMMLU: Multiple-choice questions in African languages
  • AfriSenti: Sentiment analysis for African languages

Model Description

  • Base Model: google/gemma-3-4b-it
  • Fine-tuning Method: QLoRA (4-bit quantization + LoRA)
  • LoRA Rank: 32
  • Training Hardware: NVIDIA A30 24GB
  • Languages: Multiple African languages

Training Details

Datasets

  1. AfriMGSM: Mathematical reasoning in African languages
  2. AfriMMLU: General knowledge multiple-choice questions
  3. AfriSenti: Sentiment classification

Training Hyperparameters

  • Batch Size: 4
  • Gradient Accumulation Steps: 8 (Effective batch size: 32)
  • Learning Rate: 2e-4
  • LoRA Rank: 32
  • LoRA Alpha: 64
  • Max Sequence Length: 512
  • Epochs: 3
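
The arithmetic behind these settings can be checked with a short sketch. It assumes single-device training (so the effective batch size is batch size times accumulation steps) and the usual LoRA convention that the adapter update is scaled by alpha / rank. The 1024 x 1024 layer shape is a hypothetical example for illustration, not a value taken from this model.

```python
# Sketch of the arithmetic implied by the hyperparameters listed above.
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
LORA_RANK = 32
LORA_ALPHA = 64


def effective_batch_size(batch_size: int, grad_accum_steps: int) -> int:
    """Examples seen per optimizer step on a single device."""
    return batch_size * grad_accum_steps


def lora_scaling(alpha: int, rank: int) -> float:
    """Factor applied to the low-rank update (standard LoRA: alpha / rank)."""
    return alpha / rank


def lora_params_per_matrix(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one adapted weight matrix.

    LoRA adds A (rank x d_in) and B (d_out x rank).
    """
    return rank * (d_in + d_out)


print(effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS))
print(lora_scaling(LORA_ALPHA, LORA_RANK))
# Hypothetical 1024 x 1024 projection, for illustration only.
print(lora_params_per_matrix(1024, 1024, LORA_RANK))
```

With rank 32 and alpha 64, the low-rank update is scaled by a factor of 2 before being added to the frozen base weights.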

How to Use

Installation

pip install transformers peft torch accelerate

Loading the Model

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "Ibikemi/gemma-3-4b-african-finetuned")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")

# Generate (the prompt follows the Gemma chat turn format)
prompt = "<start_of_turn>user\nWhat is 5 + 3?<end_of_turn>\n<start_of_turn>model\n"
# Move inputs to the model's device instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
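
The hand-written prompt string above can be produced by a small helper, which makes multi-turn prompts less error-prone. This is a minimal sketch: `build_gemma_prompt` is a hypothetical helper name, and it only reproduces the turn markers shown in the example prompt. The tokenizer's own `apply_chat_template` method is an alternative, though its output may differ in details such as the BOS token.

```python
def build_gemma_prompt(turns):
    """Build a Gemma-style chat prompt from (role, text) pairs.

    Roles must be "user" or "model". The prompt ends with an open
    model turn so that generation continues as the model.
    """
    parts = []
    for role, text in turns:
        if role not in ("user", "model"):
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<start_of_turn>{role}\n{text}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = build_gemma_prompt([("user", "What is 5 + 3?")])
print(repr(prompt))
```

The resulting string can be passed to the tokenizer exactly as in the example above.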

Use Cases

  1. Math Problem Solving: Solve mathematical word problems in African languages
  2. Question Answering: Answer multiple-choice questions in African languages
  3. Sentiment Analysis: Analyze sentiment in African language text
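
Each use case needs its input phrased as a user message. This model card does not state the prompt format used during fine-tuning, so the templates below are assumptions for illustration only. The function names and instruction wording are hypothetical and may need adjusting to match the training data.

```python
import string


def math_prompt(question: str) -> str:
    # Hypothetical wording; the fine-tuning prompt format is not documented.
    return f"Solve the following problem and give the final answer.\n\n{question}"


def multiple_choice_prompt(question: str, choices: list) -> str:
    # Label the options A, B, C, ... in the order given.
    lines = [question]
    for letter, choice in zip(string.ascii_uppercase, choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer with the letter of the correct option.")
    return "\n".join(lines)


def sentiment_prompt(text: str) -> str:
    # Assumes a three-way positive / negative / neutral label set.
    return (
        "Classify the sentiment of this text as positive, negative, or neutral."
        f"\n\nText: {text}"
    )


print(multiple_choice_prompt("2 + 2 = ?", ["3", "4", "5", "6"]))
```

The returned string is the user message. It still needs to be wrapped in the Gemma chat turn markers before generation.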

Limitations

  • Inherits the limitations of the base Gemma 3 4B model
  • Performance may vary across African languages
  • Results are likely to be best on languages that are well represented in the training data

Citation

If you use this model, please cite the original datasets:

@misc{afrimgsm,
 title={AfriMGSM: African Mathematical Reasoning},
 author={Masakhane},
 year={2024}
}

@misc{afrimmlu,
 title={AfriMMLU: African Languages MMLU},
 author={Masakhane},
 year={2024}
}

@misc{afrisenti,
 title={AfriSenti: Sentiment Analysis for African Languages},
 author={HausaNLP},
 year={2023}
}

License

This model inherits the Gemma license from the base model.
