| 🤗 Models | 📊 Dataset | 📄 Paper |
📢 The paper associated with this model has been accepted to the AAAI-26 Workshop on Personalization in the Era of Large Foundation Models (PerFM).
🚀 Human-Like-Mistral-Nemo-Instruct-2407
This model is a fine-tuned version of mistralai/Mistral-Nemo-Instruct-2407, specifically optimized to generate more human-like and conversational responses.
The fine-tuning process employed both Low-Rank Adaptation (LoRA) and Direct Preference Optimization (DPO) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.
The process of creating this model is detailed in the research paper “Enhancing Human-Like Responses in Large Language Models”.
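To make the DPO step mentioned above more concrete, here is a minimal sketch of the standard per-pair DPO loss in plain Python. It is not the training code used for this model: the function name, the `beta` value, and the example log-probabilities are illustrative assumptions, and real training computes these quantities over batches of token log-probabilities.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_gain - rejected_gain))).

    beta=0.1 is an assumed illustrative value, not a setting from the paper.
    """
    # How much more the policy favors each response than the frozen reference.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)

    # -log(sigmoid(m)) equals softplus(-m); use the numerically stable form.
    x = -margin
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


# Hypothetical sequence log-probabilities for one preference pair.
no_preference = dpo_loss(-10.0, -10.0, -10.0, -10.0)
prefers_chosen = dpo_loss(-8.0, -12.0, -10.0, -10.0)
print(no_preference, prefers_chosen)
```

When the policy and the reference agree, the margin is zero and the loss is log 2. The loss falls as the policy raises the chosen response's likelihood relative to the rejected one.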
🛠️ Training Configuration
- Base Model: Mistral-Nemo-Instruct-2407
- Framework: Axolotl v0.4.1
- Hardware: 2x NVIDIA A100 (80 GB) GPUs
- Training Time: ~3 hours 40 minutes
- Dataset: Synthetic dataset with ≈11,000 samples across 256 diverse topics
💬 Prompt Template
Use the Mistral-Nemo prompt template with this model:
Mistral-Nemo
<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]
This prompt template is available as a chat template, which means you can format messages using the
tokenizer.apply_chat_template() method:
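from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
]
# return_dict=True returns input_ids and attention_mask, so ** unpacking works
gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt", return_dict=True)
output = model.generate(**gen_input, max_new_tokens=128)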
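For readers who want to see the prompt layout without loading a tokenizer, the sketch below rebuilds the example string from this section by hand. The function name `render_mistral_prompt` is hypothetical, and the spacing simply mirrors the example shown above; the tokenizer's own chat template remains authoritative and may differ in whitespace or in how it treats system messages.

```python
def render_mistral_prompt(messages):
    """Render user/assistant turns in the [INST] ... [/INST] layout shown above.

    Hypothetical helper for illustration only; system messages are not handled.
    """
    parts = ["<s>"]
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "user":
            parts.append(f"[INST] {content} [/INST]")
        elif role == "assistant":
            # Each assistant turn is closed with the end-of-sequence marker.
            parts.append(f"{content}</s> ")
        else:
            raise ValueError(f"unsupported role: {role}")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]
prompt = render_mistral_prompt(conversation)
print(prompt)
```

In practice, prefer `tokenizer.apply_chat_template()`, since it tracks any changes to the template shipped with the model.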
🤖 Models
| Model | Download |
|---|---|
| Human-Like-Llama-3-8B-Instruct | 🤗 HuggingFace |
| Human-Like-Qwen-2.5-7B-Instruct | 🤗 HuggingFace |
| Human-Like-Mistral-Nemo-Instruct | 🤗 HuggingFace |
🔄 Quantized versions
GGUF @bartowski
https://huggingface.co/bartowski/Human-Like-LLama3-8B-Instruct-GGUF
https://huggingface.co/bartowski/Human-Like-Qwen2.5-7B-Instruct-GGUF
https://huggingface.co/bartowski/Human-Like-Mistral-Nemo-Instruct-2407-GGUF
🎯 Benchmark Results
| Group | Model | Average | IFEval | BBH | MATH Lvl 5 | GPQA | MuSR | MMLU-PRO |
|---|---|---|---|---|---|---|---|---|
| Llama Models | Human-Like-Llama-3-8B-Instruct | 22.37 | 64.97 | 28.01 | 8.45 | 0.78 | 2.00 | 30.01 |
| | Llama-3-8B-Instruct | 23.57 | 74.08 | 28.24 | 8.68 | 1.23 | 1.60 | 29.60 |
| | Difference (Human-Like) | -1.20 | -9.11 | -0.23 | -0.23 | -0.45 | +0.40 | +0.41 |
| Qwen Models | Human-Like-Qwen-2.5-7B-Instruct | 26.66 | 72.84 | 34.48 | 0.00 | 6.49 | 8.42 | 37.76 |
| | Qwen-2.5-7B-Instruct | 26.86 | 75.85 | 34.89 | 0.00 | 5.48 | 8.45 | 36.52 |
| | Difference (Human-Like) | -0.20 | -3.01 | -0.41 | 0.00 | +1.01 | -0.03 | +1.24 |
| Mistral Models | Human-Like-Mistral-Nemo-Instruct | 22.88 | 54.51 | 32.70 | 7.62 | 5.03 | 9.39 | 28.00 |
| | Mistral-Nemo-Instruct | 23.53 | 63.80 | 29.68 | 5.89 | 5.37 | 8.48 | 27.97 |
| | Difference (Human-Like) | -0.65 | -9.29 | +3.02 | +1.73 | -0.34 | +0.91 | +0.03 |
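The Average and Difference columns can be checked with a few lines of arithmetic. The sketch below uses the Mistral rows from the table and assumes that Average is the unweighted mean of the six benchmark scores, which is consistent with the reported numbers; variable names are illustrative.

```python
BENCHMARKS = ["IFEval", "BBH", "MATH Lvl 5", "GPQA", "MuSR", "MMLU-PRO"]

# Scores copied from the Mistral rows of the table above.
human_like = [54.51, 32.70, 7.62, 5.03, 9.39, 28.00]
base = [63.80, 29.68, 5.89, 5.37, 8.48, 27.97]


def mean(scores):
    """Unweighted mean of a list of scores."""
    return sum(scores) / len(scores)


# Per-benchmark difference: fine-tuned score minus base score.
differences = {
    name: tuned - original
    for name, tuned, original in zip(BENCHMARKS, human_like, base)
}

print(f"Average (Human-Like): {mean(human_like):.3f}")
print(f"Average (base):       {mean(base):.3f}")
for name, delta in differences.items():
    print(f"{name:>10}: {delta:+.2f}")
```

Read this way, the drop in the overall average comes mostly from IFEval, while BBH and MATH Lvl 5 improve for the Mistral variant.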
📊 Dataset
The dataset used for fine-tuning was generated using LLaMA 3 models. The dataset includes 10,884 samples across 256 distinct topics such as technology, daily life, science, history, and arts. Each sample consists of:
- Human-like responses: Natural, conversational answers mimicking human dialogue.
- Formal responses: Structured and precise answers with a more formal tone.
The dataset has been open-sourced and is available at:
More details on the dataset creation process can be found in the accompanying research paper.
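As a rough illustration of how such a sample could feed preference training, the sketch below maps one question with its two responses to a prompt/chosen/rejected record. Treating the human-like response as preferred is an assumption based on the model's stated goal, and the field names, the helper `to_preference_record`, and the example text are hypothetical rather than taken from the released dataset.

```python
def to_preference_record(question, human_like_response, formal_response):
    """Build one preference pair (hypothetical field names).

    Assumption: the human-like answer is the preferred ("chosen") one and the
    formal answer is the dispreferred ("rejected") one.
    """
    return {
        "prompt": question,
        "chosen": human_like_response,
        "rejected": formal_response,
    }


# Made-up sample, for illustration only; not drawn from the dataset.
record = to_preference_record(
    question="What do you like to do on weekends?",
    human_like_response="Honestly, I love a slow morning with coffee. You?",
    formal_response="Weekend activities commonly include rest and recreation.",
)
print(record["chosen"])
```

Check the dataset's actual column names before adapting this to a real training pipeline.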
📝 Citation
@misc{çalık2025enhancinghumanlikeresponseslarge,
title={Enhancing Human-Like Responses in Large Language Models},
author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
year={2025},
eprint={2501.05032},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.05032},
}
Evaluation results
- IFEval (0-shot), strict accuracy, Open LLM Leaderboard: 54.510
- BBH (3-shot), normalized accuracy, Open LLM Leaderboard: 32.710
- MATH Lvl 5 (4-shot), exact match, Open LLM Leaderboard: 7.630
- GPQA (0-shot), acc_norm, Open LLM Leaderboard: 5.030
- MuSR (0-shot), acc_norm, Open LLM Leaderboard: 9.400
- MMLU-PRO (5-shot, test set), accuracy, Open LLM Leaderboard: 28.010
