FuadeAI-50M

A roughly 50-million-parameter (51.5M) causal language model trained for conversational chat, built on the GPT-2 architecture with a custom tokenizer.

Model Details

Property	Value
Parameters	51.5M
Architecture	GPT-2 (custom config)
Hidden size	512
Layers	8
Attention heads	8
Context length	1024 tokens
Tokenizer	GPT-2 + custom special tokens
Training precision	FP16
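
The 51.5M figure can be sanity-checked from the table with a little arithmetic. The sketch below counts parameters for a standard GPT-2 layout (tied input/output embeddings, 4x MLP expansion). The vocabulary size is an assumption: 50,257 is the stock GPT-2 vocabulary, and each custom special token would add one more row to the embedding matrix. `gpt2_param_count` is a hypothetical helper, not part of any library.

```python
def gpt2_param_count(vocab_size, n_positions, d_model, n_layers):
    """Count parameters of a standard GPT-2 model with tied embeddings."""
    token_emb = vocab_size * d_model
    pos_emb = n_positions * d_model
    # Per block: attention (QKV + output projection), MLP (4x expansion),
    # and two LayerNorms, weights plus biases: 12*d^2 + 13*d.
    per_layer = 12 * d_model**2 + 13 * d_model
    final_ln = 2 * d_model
    return token_emb + pos_emb + n_layers * per_layer + final_ln

# Assumption: stock GPT-2 vocabulary of 50,257 entries (custom tokens not counted).
total = gpt2_param_count(vocab_size=50257, n_positions=1024, d_model=512, n_layers=8)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the count comes to 51,475,968, which rounds to the 51.5M in the table. About half of that sits in the token embedding matrix, which is typical for a model this small with a full GPT-2 vocabulary.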

Special Tokens

Token	Purpose
`<|startoftext|>`	Beginning of conversation
`<user>` / `</user>`	Wraps user message
`<assistant>` / `</assistant>`	Wraps assistant response
`<|endoftext|>`	End of conversation
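
Put together, the tokens in the table suggest the following layout for a single exchange. This is a sketch inferred from the table and from the prompt built in the usage code, assuming the tokenizer's BOS token is `<|startoftext|>`. The exact training template (whitespace, multi-turn handling) is not documented here, and `format_example` is a hypothetical helper.

```python
BOS = "<|startoftext|>"
EOS = "<|endoftext|>"

def format_example(user_message, assistant_reply=None):
    """Build a single-turn string using the special tokens listed above.

    With assistant_reply=None the string stops after the opening
    <assistant> tag, which is the form used to prompt the model.
    """
    text = f"{BOS}<user>{user_message}</user><assistant>"
    if assistant_reply is not None:
        # Assumed full-example layout: close the reply, then end the conversation.
        text += f"{assistant_reply}</assistant>{EOS}"
    return text

print(format_example("Hello!"))
print(format_example("Hello!", "Hi there!"))
```

The first call yields a generation prompt; the second shows what a complete example might look like if you prepare fine-tuning data in the same style.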

Training Data

LucidexAi/VIBE-2K
HuggingFaceTB/instruct-data-basics-smollm-H4
MuskumPillerum/General-Knowledge (4k random rows)
Custom synthetic dataset for identity and conversational grounding

How To Use

Transformers

from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

# Load model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("Fu01978/FuadeAI-50M")
model = GPT2LMHeadModel.from_pretrained("Fu01978/FuadeAI-50M")
model.eval()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Chat function
def chat(prompt, temperature=0.4, top_p=0.9, max_new_tokens=100):
    # Wrap the prompt in the chat template and leave the assistant turn open
    formatted = (
        f"{tokenizer.bos_token}"
        f"<user>{prompt}</user>"
        f"<assistant>"
    )
    inputs = tokenizer(formatted, return_tensors="pt").to(device)

    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=temperature,
            top_p=top_p,
            repetition_penalty=1.2,
            no_repeat_ngram_size=3,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )

    # Keep only the newly generated tokens (drop the prompt)
    generated = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip()

# Example usage
print(chat("Hello!"))
print(chat("Who invented the first telephone?"))
print(chat("Who are you?"))

Generation Tips

temperature=0.45: balanced creativity and coherence (recommended; note that the chat() example above defaults to 0.4)
temperature=0.2: more focused, closer to deterministic answers
temperature=0.8: more creative but less reliable
repetition_penalty=1.2: keeps responses from looping (recommended)
max_new_tokens=100: increase for longer responses
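
To make the temperature settings concrete, the sketch below applies temperature scaling to a made-up set of logits and prints the resulting probabilities. The logits are illustrative only and are not taken from the model; `softmax_with_temperature` is a hypothetical helper that mirrors what sampling does before drawing a token.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, 0.0]  # hypothetical next-token scores
for t in (0.2, 0.45, 0.8):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, which is why 0.2 behaves almost greedily, while 0.8 leaves more mass on the alternatives.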

Limitations

At roughly 50M parameters the model is small: factual recall is unreliable and some answers will be wrong. Verify any factual claim it makes.
Topic coverage is limited compared to large-scale models.
Not suitable for factual research, medical/legal/financial advice, or any high-stakes decision making.
The context window is limited to 1024 tokens in total (prompt plus response).
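
Because the prompt and the response share the 1024-token window, the room left for generation shrinks as the prompt grows. A minimal sketch of that budget calculation follows, with `response_budget` as a hypothetical helper and the prompt length assumed to be measured in tokens (for example `inputs["input_ids"].shape[-1]` in the usage code).

```python
CONTEXT_LENGTH = 1024  # from the Model Details table

def response_budget(prompt_tokens, requested_new_tokens=100):
    """Return how many new tokens fit after a prompt of the given token length."""
    if prompt_tokens >= CONTEXT_LENGTH:
        raise ValueError("prompt fills the whole context window; shorten it")
    return min(requested_new_tokens, CONTEXT_LENGTH - prompt_tokens)

print(response_budget(50))    # short prompt: the full request fits
print(response_budget(1000))  # long prompt: only the remaining tokens fit
```

The returned value could be passed as `max_new_tokens` so that a long prompt does not push generation past the window.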

Intended Use

Learning and experimentation with small language models
Lightweight conversational agent for low-stakes applications
Fine-tuning base for domain-specific chat applications

URL: https://huggingface.co/Fu01978/FuadeAI-50M
