qwen25-7b-empathy-new-weighted

qwen25-7b-empathy-new-weighted is a PEFT LoRA adapter fine-tuned from unsloth/Qwen2.5-7B-Instruct-bnb-4bit for empathetic, emotionally validating conversational responses.

The adapter is intended to be loaded together with the base Qwen2.5 7B Instruct model. It is not a standalone merged model.

Model Details

Model Description

The adapter was trained with supervised fine-tuning to answer user messages with emotional validation first, followed by gentle support. The prompt format includes an explicit detected emotion profile, optional conversation history, and the current user message.

Developed by: ychenrui
Shared by: ychenrui
Model type: PEFT LoRA adapter for causal language modeling
Base model: unsloth/Qwen2.5-7B-Instruct-bnb-4bit
Architecture: Qwen2.5 7B Instruct with LoRA adapters
Language: English training data; base model is multilingual
License: Apache 2.0, following the base model license
Repository: ychenrui/training-empathy-llm

Intended Use

Direct Use

Use this adapter for empathetic conversational response generation, especially when an upstream emotion classifier or application has already identified the user's likely emotional state.

Good fit examples:

Supportive chatbot responses
Empathy-style conversation demos
Research prototypes for affect-aware response generation
Controlled local experiments with emotional support phrasing

Downstream Use

This adapter can be integrated into a larger assistant that supplies:

A short emotion profile
Recent conversation history
The user's current message
Additional safety and escalation logic
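
As a rough illustration of the first item, an upstream classifier's scores could be rendered into the profile string that the prompt expects. The helper below is a hypothetical sketch: the function name, the top_k cutoff, and the two-decimal rounding are assumptions chosen only to match the "sadness 0.46, worry 0.35, helplessness 0.11" style used elsewhere in this card.

```python
def format_emotion_profile(scores, top_k=3):
    """Render classifier scores as 'label score, label score, ...'.

    `scores` maps emotion labels to probabilities. Hypothetical helper;
    the card does not document how the profile string was produced.
    """
    # Sort labels from most to least likely and keep the top few.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ", ".join(f"{label} {score:.2f}" for label, score in ranked[:top_k])


profile = format_emotion_profile(
    {"worry": 0.35, "sadness": 0.46, "helplessness": 0.11, "anger": 0.08}
)
print(profile)
```

Keeping the profile short and ordered by score mirrors the example prompt, which matters because the adapter may be sensitive to prompt format changes.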

Out-of-Scope Use

This model should not be used as a replacement for professional mental health care, emergency support, medical advice, legal advice, diagnosis, or crisis intervention. It should not be used to manipulate users, impersonate clinicians, make high-stakes decisions, or encourage dependency on an AI system.

For crisis or self-harm contexts, applications should use dedicated safety handling and route users to immediate support. In the United States, users in crisis can call or text 988.
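
One way to wire in such handling is a gate that runs before the adapter is ever called. The sketch below is only a placeholder under stated assumptions: the pattern list, the reply text, and the route_message name are all hypothetical, and a keyword match is not a validated crisis detector. A real deployment would swap in a dedicated classifier and a reviewed escalation policy.

```python
import re

# Placeholder patterns for illustration only; NOT a validated crisis detector.
CRISIS_PATTERNS = [
    r"\bkill myself\b",
    r"\bsuicid(e|al)\b",
    r"\bself[- ]harm\b",
    r"\bend my life\b",
]

# Placeholder wording; the 988 reference follows the guidance in this card.
CRISIS_REPLY = (
    "It sounds like you may be going through something very painful. "
    "If you are in the United States, you can call or text 988 right now "
    "to reach immediate support."
)


def route_message(message, generate_fn):
    """Return (route, reply). Crisis-like text never reaches the model."""
    lowered = message.lower()
    if any(re.search(pattern, lowered) for pattern in CRISIS_PATTERNS):
        return "crisis", CRISIS_REPLY
    # Only non-crisis messages are passed to the empathy adapter.
    return "model", generate_fn(message)


route, reply = route_message(
    "I just found out my dog is sick.",
    lambda text: "(adapter response goes here)",  # stand-in for model.generate
)
print(route)
```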

Bias, Risks, and Limitations

The model can still hallucinate, misunderstand emotional context, over-validate harmful beliefs, miss crisis signals, or produce generic comfort. It may reflect biases present in the base model and in public conversational datasets. It was trained for supportive wording, not clinical reasoning.

Known limitations:

No formal clinical validation
No claim of therapeutic efficacy
Evaluation is limited to validation loss and local smoke tests
May perform worse outside English or outside everyday support conversations
May be sensitive to prompt format changes

Recommendations

Use this model with application-level safeguards, human review for sensitive deployments, and separate crisis detection. Downstream systems should clearly disclose that responses are AI-generated and should not be treated as professional advice.

How to Get Started

Install dependencies:

pip install unsloth peft transformers accelerate bitsandbytes torch

Load the adapter with Unsloth:

from unsloth import FastLanguageModel
import torch

BASE_MODEL = "unsloth/Qwen2.5-7B-Instruct-bnb-4bit"
ADAPTER_MODEL = "YOUR_USERNAME/qwen25-7b-empathy-new-weighted"

SYSTEM_PROMPT = (
    "You are a warm and empathetic AI assistant. "
    "Always respond with emotional support first. "
    "Acknowledge and validate the user's feelings before asking questions. "
    "Keep responses 2-4 sentences."
)


def build_user_content(emotion, message, history=None):
    """Build the user turn: emotion profile, optional history, then the message."""
    history = history or []
    history_text = "\n".join(
        f"{turn['role']}: {turn['content']}" for turn in history
    )
    history_block = f"Conversation history:\n{history_text}\n\n" if history_text else ""
    return (
        f"Detected emotion profile: {emotion}\n\n"
        + history_block
        + f"User message:\n<user>{message}</user>\n\n"
        + "Respond with emotional validation first, then gentle support. "
        "If multiple emotions are present, acknowledge the main emotion "
        "and reflect the others naturally."
    )


# Load the 4-bit base model, then attach the LoRA adapter on top of it.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL,
    max_seq_length=2048,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)
model.load_adapter(ADAPTER_MODEL)
FastLanguageModel.for_inference(model)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": build_user_content(
            "sadness 0.46, worry 0.35, helplessness 0.11",
            "I just found out my dog is sick.",
        ),
    },
]

# Render the chat template to text and tokenize it on the model's device.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=260,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response.strip())

If you upload this adapter under a different Hugging Face username or organization, replace YOUR_USERNAME/qwen25-7b-empathy-new-weighted with that repository ID.

Training Details

Training Data

The local processed dataset contains 15,000 examples:

Split	Examples
Train	14,250
Validation	450
Test	300

Source mix used in the processed data:

Source	Examples
EmpatheticDialogues	8,250
ESConv	3,000
DailyDialog	1,500
CounselChat	1,500
Hand-written safety examples	750

The training data was converted into chat-style supervised fine-tuning records. Each record contains a system message, a user message with the detected emotion profile and optional history, and an assistant response.
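
A single record might look like the sketch below. The messages key and the make_record helper are assumptions made for illustration, since this card does not specify the on-disk schema. The user-content layout and system prompt are copied from How to Get Started, and the assistant reply is invented example text, not a row from the dataset.

```python
import json

SYSTEM_PROMPT = (
    "You are a warm and empathetic AI assistant. "
    "Always respond with emotional support first. "
    "Acknowledge and validate the user's feelings before asking questions. "
    "Keep responses 2-4 sentences."
)


def make_record(emotion, message, reply, history=None):
    """Build one chat-style SFT record (hypothetical schema)."""
    history_text = "\n".join(
        f"{turn['role']}: {turn['content']}" for turn in (history or [])
    )
    history_block = f"Conversation history:\n{history_text}\n\n" if history_text else ""
    user_content = (
        f"Detected emotion profile: {emotion}\n\n"
        + history_block
        + f"User message:\n<user>{message}</user>\n\n"
        + "Respond with emotional validation first, then gentle support. "
        "If multiple emotions are present, acknowledge the main emotion "
        "and reflect the others naturally."
    )
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": reply},
        ]
    }


record = make_record(
    "sadness 0.46, worry 0.35",
    "I just found out my dog is sick.",
    "I'm so sorry. That is a lot to take in.",  # illustrative reply only
    history=[{"role": "user", "content": "It has been a rough week."}],
)
print(json.dumps(record, indent=2))
```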

Training Procedure

The adapter was trained with QLoRA using Unsloth and TRL supervised fine-tuning.

Preprocessing

The preprocessing pipeline:

Normalized user and assistant text
Converted multiple conversational datasets into one chat format
Preserved recent conversation history when available
Filtered very short user and assistant turns
Added hand-written safety examples for crisis-response anchoring
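
The normalization and short-turn filtering steps could be sketched as follows. This is not the project's actual pipeline: the whitespace-collapsing rule, the MIN_WORDS threshold, and both function names are assumptions, because the card does not document the real cutoffs.

```python
import re

MIN_WORDS = 3  # hypothetical threshold; the actual cutoff is not documented


def normalize_text(text):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def keep_pair(user_text, assistant_text, min_words=MIN_WORDS):
    """Keep a pair only if both normalized turns have at least `min_words` words."""
    return all(
        len(normalize_text(turn).split()) >= min_words
        for turn in (user_text, assistant_text)
    )


pairs = [
    ("  I failed   my exam\ntoday. ", "That sounds really disappointing."),
    ("ok", "Glad to hear it."),  # user turn too short, so the pair is dropped
]
kept = [
    (normalize_text(user), normalize_text(assistant))
    for user, assistant in pairs
    if keep_pair(user, assistant)
]
print(kept)
```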

Training Hyperparameters

Hyperparameter	Value
Base model	`unsloth/Qwen2.5-7B-Instruct-bnb-4bit`
Max sequence length	2048
Epochs	3
Train batch size per device	2
Eval batch size per device	2
Gradient accumulation steps	8
Learning rate	0.0001
Warmup ratio	0.03
Weight decay	0.01
Scheduler	cosine
Optimizer	adamw_8bit
LoRA rank	16
LoRA alpha	16
LoRA dropout	0.0
Target modules	`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
Gradient checkpointing	unsloth
Final global step	2673
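
The final global step is consistent with the other values in the table. The short check below reproduces it from the 14,250 training examples, assuming a single GPU (an assumption in line with the single-GPU design noted under Environmental Impact) and that a partial final batch still counts as one optimizer step.

```python
import math

train_examples = 14_250
per_device_batch = 2
grad_accum_steps = 8
epochs = 3
num_gpus = 1  # assumption: single-GPU run

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum_steps * num_gpus
# A partial final batch is rounded up to a full step.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs

# LoRA updates are scaled by alpha / rank; here both are 16.
lora_scale = 16 / 16

print(effective_batch, steps_per_epoch, total_steps, lora_scale)
```

Under these assumptions the effective batch size is 16, giving 891 steps per epoch and 2673 steps over 3 epochs.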

Speeds, Sizes, Times

Final adapter file size: about 154 MB
Final checkpoint step: 2673
Hardware and wall-clock training time were not recorded in the model card metadata.
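
The adapter size can be sanity-checked from the LoRA rank and target modules. The layer dimensions below are assumptions taken from the publicly available Qwen2.5-7B configuration, not from this card, and storing the adapter weights as float32 is likewise an assumption.

```python
# Assumed Qwen2.5-7B dimensions (public base model config; not stated in this card).
HIDDEN = 3584
INTERMEDIATE = 18944
LAYERS = 28
KV_DIM = 4 * 128  # 4 key/value heads of size 128 (grouped-query attention)

RANK = 16  # LoRA rank from the hyperparameter table

# (in_features, out_features) for each targeted projection.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# LoRA adds A (rank x in) and B (out x rank) to every targeted module.
params_per_layer = sum(RANK * (fan_in + fan_out) for fan_in, fan_out in SHAPES.values())
total_params = params_per_layer * LAYERS
size_mib = total_params * 4 / 2**20  # assumption: 4 bytes per weight (float32)

print(total_params, round(size_mib, 1))
```

Under these assumptions the adapter has roughly 40.4 million trainable parameters, which works out to about 154 MiB and lines up with the reported file size.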

Evaluation

Testing Data, Factors & Metrics

Testing Data

Evaluation used the held-out 450-example validation split generated by the local preprocessing pipeline. A separate 300-example test split was also produced, but no formal benchmark results are reported here.

Factors

No disaggregated evaluation by demographic group, emotion label, dataset source, or crisis category has been completed.

Metrics

Validation loss was tracked during training. Empathy-style responses were also spot-checked with local qualitative smoke tests, which should not be interpreted as a formal safety or clinical evaluation.

Results

Step	Validation loss
250	2.1326
500	2.0806
750	2.0574
1000	2.0429
1250	2.0347
1500	2.0259
1750	2.0206
2000	2.0604
2250	2.0632
2500	2.0655
2673	2.0661

The best validation loss in the recorded training log was 2.0206 at step 1750. Validation loss rose after that point, and the final adapter, saved after 3 epochs at step 2673, has a validation loss of 2.0661.
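
The comparison between the best and final checkpoints can be reproduced from the table with a few lines. This is only a sketch over the logged values. Whether intermediate checkpoints were kept is not stated in this card, so selecting the step-1750 weights may not be possible in practice.

```python
# Validation loss by training step, copied from the results table.
val_loss = {
    250: 2.1326, 500: 2.0806, 750: 2.0574, 1000: 2.0429,
    1250: 2.0347, 1500: 2.0259, 1750: 2.0206, 2000: 2.0604,
    2250: 2.0632, 2500: 2.0655, 2673: 2.0661,
}

best_step = min(val_loss, key=val_loss.get)  # step with the lowest loss
final_step = max(val_loss)                   # last logged step
gap = val_loss[final_step] - val_loss[best_step]

print(best_step, val_loss[best_step], final_step, round(gap, 4))
```

The final adapter sits about 0.0455 above the best logged validation loss, which is worth keeping in mind when comparing it against earlier checkpoints.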

Summary

The model was optimized for warmer, more emotionally validating responses. It still requires robust downstream safety checks for crisis, medical, legal, or other high-stakes use.

Environmental Impact

Carbon emissions were not measured for this run.

Hardware Type: Not recorded; training config was designed for a single RTX 4090-class GPU
Hours used: Not recorded
Cloud Provider: Not recorded
Compute Region: Not recorded
Carbon Emitted: Not measured

Technical Specifications

Model Architecture and Objective

This is a LoRA adapter for Qwen2.5 7B Instruct. It was trained with a causal language modeling objective on chat-formatted supervised examples. The adapter targets the attention and MLP projection modules.

Compute Infrastructure

Hardware

Not recorded in the model metadata.

Software

PEFT: 0.19.1
TRL: 0.24.0
Transformers: 5.5.0
PyTorch: 2.10.0+cu126
Datasets: 4.3.0
Tokenizers: 0.22.2
Unsloth

Citation

If you use this adapter, cite the base model and the training libraries where appropriate.

Qwen2.5:

@misc{qwen2.5,
 title = {Qwen2.5: A Party of Foundation Models},
 url = {https://qwenlm.github.io/blog/qwen2.5/},
 author = {Qwen Team},
 month = {September},
 year = {2024}
}

TRL:

@misc{vonwerra2022trl,
 title = {{TRL: Transformer Reinforcement Learning}},
 author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouedec},
 year = {2020},
 publisher = {GitHub},
 howpublished = {\url{https://github.com/huggingface/trl}}
}

More Information

Base model: unsloth/Qwen2.5-7B-Instruct-bnb-4bit
Project repository: ychenrui/training-empathy-llm

Model Card Authors

ychenrui

Model Card Contact

Open an issue in the project repository: ychenrui/training-empathy-llm

URL: https://huggingface.co/JamieYCR/qwen25-7b-empathy-new-weighted