qwen25-7b-empathy-new-weighted
qwen25-7b-empathy-new-weighted is a PEFT LoRA adapter fine-tuned from unsloth/Qwen2.5-7B-Instruct-bnb-4bit for empathetic, emotionally validating conversational responses.
The adapter is intended to be loaded together with the base Qwen2.5 7B Instruct model. It is not a standalone merged model.
Model Details
Model Description
This model was supervised fine-tuned to respond to user messages with emotional validation first, then gentle support. The prompt format includes an explicit detected emotion profile, optional conversation history, and the current user message.
- Developed by: ychenrui
- Shared by: ychenrui
- Model type: PEFT LoRA adapter for causal language modeling
- Base model: unsloth/Qwen2.5-7B-Instruct-bnb-4bit
- Architecture: Qwen2.5 7B Instruct with LoRA adapters
- Language: English training data; base model is multilingual
- License: Apache 2.0, following the base model license
- Repository: ychenrui/training-empathy-llm
Intended Use
Direct Use
Use this adapter for empathetic conversational response generation, especially when an upstream emotion classifier or application has already identified the user's likely emotional state.
Good fit examples:
- Supportive chatbot responses
- Empathy-style conversation demos
- Research prototypes for affect-aware response generation
- Controlled local experiments with emotional support phrasing
Downstream Use
This adapter can be integrated into a larger assistant that supplies:
- A short emotion profile
- Recent conversation history
- The user's current message
- Additional safety and escalation logic
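The emotion profile used in the prompt is a short comma-separated string such as "sadness 0.46, worry 0.35, helplessness 0.11". A minimal sketch of how an application might build that string from classifier scores is shown here. The function name `format_emotion_profile`, the `top_k` cutoff, and the score dictionary are illustrative assumptions and are not part of the released code.

```python
def format_emotion_profile(scores: dict[str, float], top_k: int = 3) -> str:
    """Render the top_k highest-scoring emotions as 'label score, label score'."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ", ".join(f"{label} {score:.2f}" for label, score in ranked[:top_k])


# Hypothetical classifier output; labels and values are made up for illustration.
scores = {"worry": 0.35, "sadness": 0.46, "helplessness": 0.11, "anger": 0.08}
profile = format_emotion_profile(scores)
print(profile)
```

The resulting string can be passed as the emotion argument when building the user prompt.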
Out-of-Scope Use
This model should not be used as a replacement for professional mental health care, emergency support, medical advice, legal advice, diagnosis, or crisis intervention. It should not be used to manipulate users, impersonate clinicians, make high-stakes decisions, or encourage dependency on an AI system.
For crisis or self-harm contexts, applications should use dedicated safety handling and route users to immediate support. In the United States, users in crisis can call or text 988.
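One way to place safety handling in front of the adapter is a routing step that runs before generation. The sketch below is only an illustration under stated assumptions: the pattern list, the fixed reply text, and the names `looks_like_crisis` and `respond` are hypothetical, and a keyword match is a coarse pre-filter, not a replacement for a dedicated crisis classifier.

```python
import re
from typing import Callable

# Hypothetical and deliberately incomplete pattern list (assumption).
CRISIS_PATTERNS = [
    r"\bkill myself\b",
    r"\bsuicid",
    r"\bend my life\b",
    r"\bself[- ]harm",
]

# Fixed reply used instead of model output when a pattern matches (assumption).
CRISIS_REPLY = (
    "I'm really concerned about your safety. If you are in the United States, "
    "you can call or text 988 right now to reach immediate support."
)


def looks_like_crisis(message: str) -> bool:
    """Return True if any crisis pattern appears in the lowercased message."""
    text = message.lower()
    return any(re.search(pattern, text) for pattern in CRISIS_PATTERNS)


def respond(message: str, generate: Callable[[str], str]) -> str:
    """Route crisis-like messages to a fixed reply; otherwise call the model."""
    if looks_like_crisis(message):
        return CRISIS_REPLY
    return generate(message)


def fake_generate(message: str) -> str:
    """Stand-in for the adapter's generation call so the sketch runs anywhere."""
    return "That sounds really hard. I'm here with you."


routed = respond("I want to end my life", fake_generate)
normal = respond("I just found out my dog is sick.", fake_generate)
```

In a real deployment, `generate` would wrap the adapter's generation call, and the pre-filter would sit alongside a proper classifier and human escalation path.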
Bias, Risks, and Limitations
The model can still hallucinate, misunderstand emotional context, over-validate harmful beliefs, miss crisis signals, or produce generic comfort. It may reflect biases present in the base model and in public conversational datasets. It was trained for supportive wording, not clinical reasoning.
Known limitations:
- No formal clinical validation
- No claim of therapeutic efficacy
- Limited evaluation beyond validation loss and local smoke tests
- May perform worse outside English or outside everyday support conversations
- May be sensitive to prompt format changes
Recommendations
Use this model with application-level safeguards, human review for sensitive deployments, and separate crisis detection. Downstream systems should clearly disclose that responses are AI-generated and should not be treated as professional advice.
How to Get Started
Install dependencies:
pip install unsloth peft transformers accelerate bitsandbytes torch
Load the adapter with Unsloth:
from unsloth import FastLanguageModel
import torch

BASE_MODEL = "unsloth/Qwen2.5-7B-Instruct-bnb-4bit"
ADAPTER_MODEL = "YOUR_USERNAME/qwen25-7b-empathy-new-weighted"

SYSTEM_PROMPT = (
    "You are a warm and empathetic AI assistant. "
    "Always respond with emotional support first. "
    "Acknowledge and validate the user's feelings before asking questions. "
    "Keep responses 2-4 sentences."
)

def build_user_content(emotion, message, history=None):
    history = history or []
    history_text = "\n".join(
        f"{turn['role']}: {turn['content']}" for turn in history
    )
    history_block = f"Conversation history:\n{history_text}\n\n" if history_text else ""
    return (
        f"Detected emotion profile: {emotion}\n\n"
        + history_block
        + f"User message:\n<user>{message}</user>\n\n"
        + "Respond with emotional validation first, then gentle support. "
        "If multiple emotions are present, acknowledge the main emotion "
        "and reflect the others naturally."
    )

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL,
    max_seq_length=2048,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)
model.load_adapter(ADAPTER_MODEL)
FastLanguageModel.for_inference(model)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": build_user_content(
            "sadness 0.46, worry 0.35, helplessness 0.11",
            "I just found out my dog is sick.",
        ),
    },
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=260,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response.strip())
If you upload this adapter under a different Hugging Face username or organization, replace YOUR_USERNAME/qwen25-7b-empathy-new-weighted with that repository ID.
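The prompt builder also accepts optional conversation history. The sketch below repeats `build_user_content` so it runs on its own and shows what the user content looks like with two prior turns. The role labels `user` and `assistant` and the example turns are assumptions made for illustration; the labels used in the training data are not stated here.

```python
def build_user_content(emotion, message, history=None):
    """Mirror of the prompt builder from the loading example."""
    history = history or []
    history_text = "\n".join(
        f"{turn['role']}: {turn['content']}" for turn in history
    )
    history_block = f"Conversation history:\n{history_text}\n\n" if history_text else ""
    return (
        f"Detected emotion profile: {emotion}\n\n"
        + history_block
        + f"User message:\n<user>{message}</user>\n\n"
        + "Respond with emotional validation first, then gentle support. "
        "If multiple emotions are present, acknowledge the main emotion "
        "and reflect the others naturally."
    )


# Hypothetical prior turns, oldest first.
history = [
    {"role": "user", "content": "My dog has been acting tired lately."},
    {"role": "assistant", "content": "That sounds worrying. How long has it been going on?"},
]

content = build_user_content(
    "worry 0.52, sadness 0.31",
    "The vet just called with bad news.",
    history,
)
print(content)
```

When `history` is empty or omitted, the "Conversation history" block is left out entirely and the prompt matches the single-turn example.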
Training Details
Training Data
The local processed dataset contains 15,000 examples:
| Split | Examples |
|---|---|
| Train | 14,250 |
| Validation | 450 |
| Test | 300 |
Source mix used in the processed data:
| Source | Examples |
|---|---|
| EmpatheticDialogues | 8,250 |
| ESConv | 3,000 |
| DailyDialog | 1,500 |
| CounselChat | 1,500 |
| Hand-written safety examples | 750 |
The training data was converted into chat-style supervised fine-tuning records. Each record contains a system message, a user message with the detected emotion profile and optional history, and an assistant response.
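To make the record layout concrete, here is a sketch of what one chat-style record could look like when serialized as a JSON line. The `messages` key, the JSONL storage, and the assistant reply are assumptions for illustration and are not copied from the dataset; only the three-part system, user, assistant structure comes from the description above.

```python
import json

SYSTEM_PROMPT = (
    "You are a warm and empathetic AI assistant. "
    "Always respond with emotional support first. "
    "Acknowledge and validate the user's feelings before asking questions. "
    "Keep responses 2-4 sentences."
)

# Hypothetical record; the assistant text is an invented example.
record = {
    "messages": [
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": (
                "Detected emotion profile: sadness 0.46, worry 0.35\n\n"
                "User message:\n<user>I just found out my dog is sick.</user>\n\n"
                "Respond with emotional validation first, then gentle support."
            ),
        },
        {
            "role": "assistant",
            "content": "I'm so sorry. It makes sense that you feel sad and worried right now.",
        },
    ]
}

line = json.dumps(record)  # one line of a JSONL file
roles = [m["role"] for m in json.loads(line)["messages"]]
print(roles)
```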
Training Procedure
The adapter was trained with QLoRA using Unsloth and TRL supervised fine-tuning.
Preprocessing
The preprocessing pipeline:
- Normalized user and assistant text
- Converted multiple conversational datasets into one chat format
- Preserved recent conversation history when available
- Filtered very short user and assistant turns
- Added hand-written safety examples for crisis-response anchoring
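The normalization and short-turn filtering steps can be sketched as follows. This is not the project's preprocessing code: the whitespace-only normalization, the word-count criterion, and the `MIN_WORDS` threshold are all assumptions, since the actual rules and cutoffs are not documented here.

```python
import re

MIN_WORDS = 3  # hypothetical threshold; the real cutoff is not documented


def normalize(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def keep_pair(user_text: str, assistant_text: str, min_words: int = MIN_WORDS) -> bool:
    """Keep a pair only if both turns contain at least min_words words."""
    return all(
        len(normalize(turn).split()) >= min_words
        for turn in (user_text, assistant_text)
    )


# Made-up pairs: the second one has a one-word user turn and is dropped.
pairs = [
    ("  I failed   my exam\nand feel awful. ", "That sounds really discouraging, I'm sorry."),
    ("ok", "Glad to hear it."),
]
cleaned = [(normalize(u), normalize(a)) for u, a in pairs if keep_pair(u, a)]
print(cleaned)
```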
Training Hyperparameters
| Hyperparameter | Value |
|---|---|
| Base model | unsloth/Qwen2.5-7B-Instruct-bnb-4bit |
| Max sequence length | 2048 |
| Epochs | 3 |
| Train batch size per device | 2 |
| Eval batch size per device | 2 |
| Gradient accumulation steps | 8 |
| Learning rate | 0.0001 |
| Warmup ratio | 0.03 |
| Weight decay | 0.01 |
| Scheduler | cosine |
| Optimizer | adamw_8bit |
| LoRA rank | 16 |
| LoRA alpha | 16 |
| LoRA dropout | 0.0 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Gradient checkpointing | unsloth |
| Final global step | 2673 |
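The final global step is consistent with the batch settings and the training split size. The arithmetic below is a sanity check under two assumptions that the table does not state: training ran on a single device, and a partial final accumulation window in each epoch counts as one optimizer step.

```python
import math

train_examples = 14_250   # train split size from the data table
per_device_batch = 2
grad_accum = 8
num_devices = 1           # assumption: single-GPU run
epochs = 3

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_devices

# Round up so the last partial window of each epoch still counts as a step.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch, steps_per_epoch, total_steps)
```

Under these assumptions the effective batch size is 16, each epoch is 891 steps, and three epochs give 2673 steps, matching the table.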
Speeds, Sizes, Times
- Final adapter file size: about 154 MB
- Final checkpoint step: 2673
- Hardware and wall-clock training time were not recorded in the model card metadata.
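The adapter size can be cross-checked from the LoRA rank and the shapes of the targeted projections. The sketch below uses the layer dimensions from the base Qwen2.5-7B configuration and assumes the adapter weights are stored as 32-bit floats, with "MB" read as MiB; neither the storage dtype nor the unit convention is stated in this card.

```python
# Qwen2.5-7B dimensions from the base model's published config.
HIDDEN = 3584
INTERMEDIATE = 18944
LAYERS = 28
KV_DIM = 4 * 128  # 4 key/value heads, head dimension 128
RANK = 16

# (in_features, out_features) for each targeted projection.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# A LoRA pair A (rank x in) and B (out x rank) adds rank * (in + out) parameters.
per_layer = sum(RANK * (n_in + n_out) for n_in, n_out in shapes.values())
total_params = per_layer * LAYERS

# Assumption: 4 bytes per parameter (float32), size reported in MiB.
size_mib = total_params * 4 / 2**20

print(total_params, size_mib)
```

Under these assumptions the adapter has about 40.4 million parameters and occupies 154 MiB, which agrees with the reported file size.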
Evaluation
Testing Data, Factors & Metrics
Testing Data
Evaluation used the held-out 450-example validation split generated by the local preprocessing pipeline. A separate 300-example test split was also produced, but no formal benchmark results are reported here.
Factors
No disaggregated evaluation by demographic group, emotion label, dataset source, or crisis category has been completed.
Metrics
Validation loss was tracked during training. Local qualitative smoke tests were also used for empathy-style responses, but they should not be interpreted as a formal safety or clinical evaluation.
Results
| Step | Validation loss |
|---|---|
| 250 | 2.1326 |
| 500 | 2.0806 |
| 750 | 2.0574 |
| 1000 | 2.0429 |
| 1250 | 2.0347 |
| 1500 | 2.0259 |
| 1750 | 2.0206 |
| 2000 | 2.0604 |
| 2250 | 2.0632 |
| 2500 | 2.0655 |
| 2673 | 2.0661 |
Best observed validation loss in the recorded training log was 2.0206 at step 1750. The final adapter was saved after 3 epochs at step 2673.
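The best-checkpoint claim can be checked directly from the logged values. The snippet below copies the table into a dictionary and computes the best step and its gap to the final step; the variable names are illustrative.

```python
# Validation loss by global step, copied from the results table.
val_loss = {
    250: 2.1326,
    500: 2.0806,
    750: 2.0574,
    1000: 2.0429,
    1250: 2.0347,
    1500: 2.0259,
    1750: 2.0206,
    2000: 2.0604,
    2250: 2.0632,
    2500: 2.0655,
    2673: 2.0661,
}

best_step = min(val_loss, key=val_loss.get)   # step with the lowest loss
final_step = max(val_loss)                    # last logged step
gap = val_loss[final_step] - val_loss[best_step]

print(best_step, final_step, round(gap, 4))
```

The released adapter corresponds to the final step, whose validation loss is about 0.0455 higher than the best logged value. Whether a checkpoint from step 1750 was retained is not stated here.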
Summary
The model was optimized for warmer, more emotionally validating responses. It still requires robust downstream safety checks for crisis, medical, legal, or other high-stakes use.
Environmental Impact
Carbon emissions were not measured for this run.
- Hardware Type: Not recorded; training config was designed for a single RTX 4090-class GPU
- Hours used: Not recorded
- Cloud Provider: Not recorded
- Compute Region: Not recorded
- Carbon Emitted: Not measured
Technical Specifications
Model Architecture and Objective
This is a LoRA adapter for Qwen2.5 7B Instruct. It was trained with a causal language modeling objective on chat-formatted supervised examples. The adapter targets the attention and MLP projection modules.
Compute Infrastructure
Hardware
Not recorded in the model metadata. The training config was designed for a single RTX 4090-class GPU.
Software
- PEFT: 0.19.1
- TRL: 0.24.0
- Transformers: 5.5.0
- PyTorch: 2.10.0+cu126
- Datasets: 4.3.0
- Tokenizers: 0.22.2
- Unsloth
Citation
If you use this adapter, cite the base model and the training libraries where appropriate.
Qwen2.5:
@misc{qwen2.5,
title = {Qwen2.5: A Party of Foundation Models},
url = {https://qwenlm.github.io/blog/qwen2.5/},
author = {Qwen Team},
month = {September},
year = {2024}
}
TRL:
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouedec},
year = {2020},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}
More Information
- Base model: unsloth/Qwen2.5-7B-Instruct-bnb-4bit
- Project repository: ychenrui/training-empathy-llm
Model Card Authors
ychenrui
Model Card Contact
Open an issue in the project repository: ychenrui/training-empathy-llm