URL: https://huggingface.co/JamieYCR/qwen25-7b-empathy-new-weighted

qwen25-7b-empathy-new-weighted

qwen25-7b-empathy-new-weighted is a PEFT LoRA adapter fine-tuned from unsloth/Qwen2.5-7B-Instruct-bnb-4bit for empathetic, emotionally validating conversational responses.

The adapter is intended to be loaded together with the base Qwen2.5 7B Instruct model. It is not a standalone merged model.

Model Details

Model Description

This model was supervised fine-tuned to respond to user messages with emotional validation first, then gentle support. The prompt format includes an explicit detected emotion profile, optional conversation history, and the current user message.

  • Developed by: ychenrui
  • Shared by: ychenrui
  • Model type: PEFT LoRA adapter for causal language modeling
  • Base model: unsloth/Qwen2.5-7B-Instruct-bnb-4bit
  • Architecture: Qwen2.5 7B Instruct with LoRA adapters
  • Language: English training data; base model is multilingual
  • License: Apache 2.0, following the base model license
  • Repository: ychenrui/training-empathy-llm

Intended Use

Direct Use

Use this adapter for empathetic conversational response generation, especially when an upstream emotion classifier or application has already identified the user's likely emotional state.

Good fit examples:

  • Supportive chatbot responses
  • Empathy-style conversation demos
  • Research prototypes for affect-aware response generation
  • Controlled local experiments with emotional support phrasing

Downstream Use

This adapter can be integrated into a larger assistant that supplies:

  • A short emotion profile
  • Recent conversation history
  • The user's current message
  • Additional safety and escalation logic
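The emotion profile in the example prompt is a comma-separated list of labels and scores. A minimal sketch of how an upstream classifier's output could be turned into that string follows; the helper name `format_emotion_profile`, the `top_k` and `min_score` parameters, and the input dictionary are illustrative assumptions, not part of the released code.

```python
def format_emotion_profile(scores, top_k=3, min_score=0.05):
    """Format classifier scores as 'label score, label score, ...'.

    Hypothetical helper: keeps the top_k highest-scoring labels and drops
    anything below min_score. Both thresholds are assumptions.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    kept = [(label, score) for label, score in ranked[:top_k] if score >= min_score]
    return ", ".join(f"{label} {score:.2f}" for label, score in kept)


# Example scores from an imagined upstream emotion classifier.
profile = format_emotion_profile(
    {"sadness": 0.46, "worry": 0.35, "helplessness": 0.11, "joy": 0.02}
)
print(profile)
```

The resulting string can be passed as the emotion argument of the prompt builder shown in the getting-started code.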

Out-of-Scope Use

This model should not be used as a replacement for professional mental health care, emergency support, medical advice, legal advice, diagnosis, or crisis intervention. It should not be used to manipulate users, impersonate clinicians, make high-stakes decisions, or encourage dependency on an AI system.

For crisis or self-harm contexts, applications should use dedicated safety handling and route users to immediate support. In the United States, users in crisis can call or text 988.

Bias, Risks, and Limitations

The model can still hallucinate, misunderstand emotional context, over-validate harmful beliefs, miss crisis signals, or produce generic comfort. It may reflect biases present in the base model and in public conversational datasets. It was trained for supportive wording, not clinical reasoning.

Known limitations:

  • No formal clinical validation
  • No claim of therapeutic efficacy
  • Limited evaluation beyond validation loss and local smoke tests
  • May perform worse outside English or outside everyday support conversations
  • May be sensitive to prompt format changes

Recommendations

Use this model with application-level safeguards, human review for sensitive deployments, and separate crisis detection. Downstream systems should clearly disclose that responses are AI-generated and should not be treated as professional advice.
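One way these recommendations might be wired together is sketched below: a check that runs before generation, and a disclosure line appended to every model reply. The regular expressions are placeholders chosen for illustration only and are far too crude for production; a real deployment would need a dedicated, evaluated crisis classifier. All names here (`CRISIS_PATTERNS`, `looks_like_crisis`, `safe_reply`, `generate_fn`) are hypothetical.

```python
import re

# Placeholder patterns (assumption): NOT a real crisis detector.
CRISIS_PATTERNS = [
    re.compile(r"\b(kill|hurt|harm)\s+myself\b", re.IGNORECASE),
    re.compile(r"\bsuicid", re.IGNORECASE),
    re.compile(r"\bend\s+my\s+life\b", re.IGNORECASE),
]

CRISIS_REPLY = (
    "It sounds like you may need immediate support. "
    "If you are in the United States, you can call or text 988."
)
AI_DISCLOSURE = "This response is AI-generated and is not professional advice."


def looks_like_crisis(message):
    """Return True if any placeholder pattern matches the message."""
    return any(pattern.search(message) for pattern in CRISIS_PATTERNS)


def safe_reply(message, generate_fn):
    """Route crisis-like messages away from the model; otherwise add a disclosure."""
    if looks_like_crisis(message):
        return CRISIS_REPLY
    return f"{generate_fn(message)}\n\n{AI_DISCLOSURE}"


# generate_fn stands in for the adapter's generation call.
print(safe_reply("I just found out my dog is sick.", lambda m: "I'm so sorry."))
```

Keeping the check outside the model means the escalation path does not depend on the adapter's own behavior.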

How to Get Started

Install dependencies:

pip install unsloth peft transformers accelerate bitsandbytes torch

Load the adapter with Unsloth:

from unsloth import FastLanguageModel
import torch

BASE_MODEL = "unsloth/Qwen2.5-7B-Instruct-bnb-4bit"
ADAPTER_MODEL = "YOUR_USERNAME/qwen25-7b-empathy-new-weighted"

SYSTEM_PROMPT = (
    "You are a warm and empathetic AI assistant. "
    "Always respond with emotional support first. "
    "Acknowledge and validate the user's feelings before asking questions. "
    "Keep responses 2-4 sentences."
)


def build_user_content(emotion, message, history=None):
    """Build the user turn: emotion profile, optional history, then the message."""
    history = history or []
    history_text = "\n".join(
        f"{turn['role']}: {turn['content']}" for turn in history
    )
    # The history block is omitted entirely when there are no earlier turns.
    history_block = (
        f"Conversation history:\n{history_text}\n\n" if history_text else ""
    )
    return (
        f"Detected emotion profile: {emotion}\n\n"
        + history_block
        + f"User message:\n<user>{message}</user>\n\n"
        + "Respond with emotional validation first, then gentle support. "
        "If multiple emotions are present, acknowledge the main emotion "
        "and reflect the others naturally."
    )


# Load the 4-bit base model, then attach the LoRA adapter on top of it.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL,
    max_seq_length=2048,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)
model.load_adapter(ADAPTER_MODEL)
FastLanguageModel.for_inference(model)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": build_user_content(
            "sadness 0.46, worry 0.35, helplessness 0.11",
            "I just found out my dog is sick.",
        ),
    },
]

# Render the chat template as text and leave the assistant turn open.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=260,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response.strip())

If you upload this adapter under a different Hugging Face username or organization, replace YOUR_USERNAME/qwen25-7b-empathy-new-weighted with that repository ID.
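For environments without Unsloth, the adapter could probably also be attached with plain Transformers and PEFT. The sketch below is an untested alternative rather than the author's documented path; it assumes `bitsandbytes` and a CUDA GPU are available so that the pre-quantized 4-bit base checkpoint can load, and the function name `load_with_peft` is hypothetical.

```python
def load_with_peft(base_id, adapter_id):
    """Load the base model with Transformers, then attach the LoRA adapter.

    Imports are kept inside the function so the sketch can be read or
    imported on a machine without the GPU stack installed.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # The base checkpoint carries its own 4-bit quantization settings.
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer


# Example call (downloads several GB of weights, so it is left commented out):
# model, tokenizer = load_with_peft(
#     "unsloth/Qwen2.5-7B-Instruct-bnb-4bit",
#     "YOUR_USERNAME/qwen25-7b-empathy-new-weighted",
# )
```

The prompt-building and generation steps from the Unsloth example should carry over unchanged, since they only depend on the tokenizer and the `generate` method.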

Training Details

Training Data

The local processed dataset contains 15,000 examples:

Split        Examples
Train        14,250
Validation   450
Test         300

Source mix used in the processed data:

Source                         Examples
EmpatheticDialogues            8,250
ESConv                         3,000
DailyDialog                    1,500
CounselChat                    1,500
Hand-written safety examples   750

The training data was converted into chat-style supervised fine-tuning records. Each record contains a system message, a user message with the detected emotion profile and optional history, and an assistant response.
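A record of this kind might look like the sketch below. The `messages` key, the helper name `make_sft_record`, and the exact layout of the user turn are assumptions modeled on the inference prompt; the actual training files may differ. The assistant text is an invented illustration, not a dataset sample.

```python
import json


def make_sft_record(system_prompt, emotion, message, response, history=None):
    """Build one chat-style SFT record (hypothetical schema)."""
    history_text = "\n".join(
        f"{turn['role']}: {turn['content']}" for turn in (history or [])
    )
    history_block = (
        f"Conversation history:\n{history_text}\n\n" if history_text else ""
    )
    user_content = (
        f"Detected emotion profile: {emotion}\n\n"
        + history_block
        + f"User message:\n<user>{message}</user>"
    )
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": response},
        ]
    }


record = make_sft_record(
    system_prompt="You are a warm and empathetic AI assistant.",
    emotion="sadness 0.46, worry 0.35",
    message="I just found out my dog is sick.",
    response="I'm so sorry. That must be really worrying for you.",  # invented
)
print(json.dumps(record, indent=2))
```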

Training Procedure

The adapter was trained with QLoRA using Unsloth and TRL supervised fine-tuning.

Preprocessing

The preprocessing pipeline:

  • Normalized user and assistant text
  • Converted multiple conversational datasets into one chat format
  • Preserved recent conversation history when available
  • Filtered very short user and assistant turns
  • Added hand-written safety examples for crisis-response anchoring
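The normalization and short-turn filtering steps could be sketched roughly as follows. The card does not state the actual rules, so the Unicode and whitespace normalization and the three-word minimum (`MIN_WORDS`) are assumptions, and all function names are hypothetical.

```python
import re
import unicodedata

MIN_WORDS = 3  # hypothetical threshold for "very short" turns


def normalize_text(text):
    """Apply Unicode NFKC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def keep_pair(user_text, assistant_text, min_words=MIN_WORDS):
    """Keep a pair only if both turns have at least min_words words."""
    return (
        len(user_text.split()) >= min_words
        and len(assistant_text.split()) >= min_words
    )


def preprocess(pairs):
    """Normalize every (user, assistant) pair and drop very short ones."""
    cleaned = []
    for user_text, assistant_text in pairs:
        user_text = normalize_text(user_text)
        assistant_text = normalize_text(assistant_text)
        if keep_pair(user_text, assistant_text):
            cleaned.append((user_text, assistant_text))
    return cleaned


sample = [
    ("  I  lost my job\ttoday. ", "That sounds really hard."),
    ("ok", "Sure."),
]
print(preprocess(sample))
```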

Training Hyperparameters

Hyperparameter                Value
Base model                    unsloth/Qwen2.5-7B-Instruct-bnb-4bit
Max sequence length           2048
Epochs                        3
Train batch size per device   2
Eval batch size per device    2
Gradient accumulation steps   8
Learning rate                 0.0001
Warmup ratio                  0.03
Weight decay                  0.01
Scheduler                     cosine
Optimizer                     adamw_8bit
LoRA rank                     16
LoRA alpha                    16
LoRA dropout                  0.0
Target modules                q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Gradient checkpointing        unsloth
Final global step             2673
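The final global step is consistent with the other hyperparameters, as the short check below shows. It assumes a single training device and that the last partial batch of each epoch counts as one optimizer step; neither is stated in the card, but together they reproduce the reported 2673 steps. The warmup step count is likewise an estimate derived from the warmup ratio.

```python
import math

train_examples = 14_250
per_device_batch = 2
grad_accum_steps = 8
epochs = 3
warmup_ratio = 0.03
num_devices = 1  # assumption: single-GPU run

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum_steps * num_devices
# A trailing partial batch still produces one step.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs
# Estimate only: assumes warmup is rounded up from ratio * total steps.
warmup_steps = math.ceil(warmup_ratio * total_steps)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions each epoch ends at a multiple of 891 steps, so the third epoch starts just after step 1782.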

Speeds, Sizes, Times

  • Final adapter file size: about 154 MB
  • Final checkpoint step: 2673
  • Hardware and wall-clock training time were not recorded in the model card metadata.
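The adapter size can be sanity-checked from the LoRA settings. The sketch below assumes the publicly documented Qwen2.5-7B dimensions (hidden size 3584, MLP size 18944, 28 layers, 4 key/value heads of dimension 128) and 32-bit storage for the adapter weights; none of these are stated in this card.

```python
# Assumed Qwen2.5-7B dimensions (not taken from this card).
HIDDEN = 3584
INTERMEDIATE = 18944
KV_DIM = 4 * 128  # key/value heads * head dimension
LAYERS = 28
RANK = 16  # LoRA rank from the hyperparameter table

# (input features, output features) of each targeted projection.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Each LoRA pair adds A (rank x in) and B (out x rank) parameters.
per_layer = sum(RANK * (n_in + n_out) for n_in, n_out in shapes.values())
total_params = per_layer * LAYERS
size_mib = total_params * 4 / 2**20  # 4 bytes per float32 weight

print(f"{total_params:,} LoRA parameters, about {size_mib:.0f} MiB in float32")
```

Under these assumptions the estimate lands at roughly 154 MiB, in line with the reported file size.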

Evaluation

Testing Data, Factors & Metrics

Testing Data

Evaluation used the held-out validation split generated by the local preprocessing pipeline. A separate 300-example test split was also produced, but no formal benchmark results are reported here.

Factors

No disaggregated evaluation by demographic group, emotion label, dataset source, or crisis category has been completed.

Metrics

Validation loss was tracked during training. Local qualitative smoke tests were also used for empathy-style responses, but they should not be interpreted as a formal safety or clinical evaluation.

Results

Step   Validation loss
250    2.1326
500    2.0806
750    2.0574
1000   2.0429
1250   2.0347
1500   2.0259
1750   2.0206
2000   2.0604
2250   2.0632
2500   2.0655
2673   2.0661

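Best observed validation loss in the recorded training log was 2.0206 at step 1750. Validation loss rose after that point, and the final adapter was saved after 3 epochs at step 2673 with a validation loss of 2.0661, so it does not correspond to the lowest-loss checkpoint.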
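The best and final checkpoints can be read off the logged values with a few lines of code. This sketch simply re-enters the table above as a list; the variable names are illustrative.

```python
# (step, validation loss) pairs copied from the results table.
val_log = [
    (250, 2.1326), (500, 2.0806), (750, 2.0574), (1000, 2.0429),
    (1250, 2.0347), (1500, 2.0259), (1750, 2.0206), (2000, 2.0604),
    (2250, 2.0632), (2500, 2.0655), (2673, 2.0661),
]

# Lowest validation loss across all logged evaluations.
best_step, best_loss = min(val_log, key=lambda row: row[1])
# The last entry corresponds to the saved final adapter.
final_step, final_loss = val_log[-1]
gap = round(final_loss - best_loss, 4)

print(f"best: step {best_step} ({best_loss}), final: step {final_step} ({final_loss})")
print(f"final minus best: {gap}")
```

A gap like this is the kind of signal that could motivate saving the best checkpoint by validation loss in a future run.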

Summary

The model was optimized for warmer, more emotionally validating responses. It still requires robust downstream safety checks for crisis, medical, legal, or other high-stakes use.

Environmental Impact

Carbon emissions were not measured for this run.

  • Hardware Type: Not recorded; training config was designed for a single RTX 4090-class GPU
  • Hours used: Not recorded
  • Cloud Provider: Not recorded
  • Compute Region: Not recorded
  • Carbon Emitted: Not measured

Technical Specifications

Model Architecture and Objective

This is a LoRA adapter for Qwen2.5 7B Instruct. It was trained with a causal language modeling objective on chat-formatted supervised examples. The adapter targets the attention and MLP projection modules.

Compute Infrastructure

Hardware

Not recorded in the model metadata.

Software

  • PEFT: 0.19.1
  • TRL: 0.24.0
  • Transformers: 5.5.0
  • PyTorch: 2.10.0+cu126
  • Datasets: 4.3.0
  • Tokenizers: 0.22.2
  • Unsloth

Citation

If you use this adapter, cite the base model and the training libraries where appropriate.

Qwen2.5:

@misc{qwen2.5,
 title = {Qwen2.5: A Party of Foundation Models},
 url = {https://qwenlm.github.io/blog/qwen2.5/},
 author = {Qwen Team},
 month = {September},
 year = {2024}
}

TRL:

@misc{vonwerra2022trl,
 title = {{TRL: Transformer Reinforcement Learning}},
 author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouedec},
 year = {2020},
 publisher = {GitHub},
 howpublished = {\url{https://github.com/huggingface/trl}}
}

More Information

Model Card Authors

ychenrui

Model Card Contact

Open an issue in the project repository: ychenrui/training-empathy-llm
