❗️❗️❗️NOTICE: For optimal performance, we refrain from fine-tuning the model's identity. Thus, inquiries such as "Who are you" or "Who developed you" may yield random responses that are not necessarily accurate.

Updates

🚀🚀🚀 [May 26, 2024] We now introduce Mistral-7B-v0.3-Chinese-Chat, the first model fine-tuned specifically for Chinese and English users based on mistralai/Mistral-7B-Instruct-v0.3! Full-parameter fine-tuned on a mixed Chinese-English dataset of ~100K preference pairs, Mistral-7B-v0.3-Chinese-Chat has significantly better Chinese ability than mistralai/Mistral-7B-Instruct-v0.3. It also performs well in mathematics, roleplay, tool use, and more.
🔥 We provide the official q4 GGUF version of Mistral-7B-v0.3-Chinese-Chat at shenzhi-wang/Mistral-7B-v0.3-Chinese-Chat-q4!
🔥 We provide the official q8 GGUF version of Mistral-7B-v0.3-Chinese-Chat at shenzhi-wang/Mistral-7B-v0.3-Chinese-Chat-q8!
🔥 We provide the official f16 GGUF version of Mistral-7B-v0.3-Chinese-Chat at shenzhi-wang/Mistral-7B-v0.3-Chinese-Chat-f16!

Model Summary

Mistral-7B-v0.3-Chinese-Chat is an instruction-tuned language model for Chinese and English users, built upon mistralai/Mistral-7B-Instruct-v0.3, with abilities such as roleplay and tool use.

Developers: Shenzhi Wang*, Yaowei Zheng*, Guoyin Wang (in.ai), Shiji Song, Gao Huang. (*: Equal Contribution)

License: Apache License 2.0
Base Model: mistralai/Mistral-7B-Instruct-v0.3
Model Size: 7.25B
Context length: 32K

1. Introduction

This is the first model fine-tuned specifically for Chinese and English users based on mistralai/Mistral-7B-Instruct-v0.3. The fine-tuning algorithm is ORPO [1].

Compared to the original mistralai/Mistral-7B-Instruct-v0.3, our Mistral-7B-v0.3-Chinese-Chat model significantly reduces the issues of "Chinese questions with English answers" and the mixing of Chinese and English in responses.

[1] Hong, Jiwoo, Noah Lee, and James Thorne. "Reference-free Monolithic Preference Optimization with Odds Ratio." arXiv preprint arXiv:2403.07691 (2024).

Training framework: LLaMA-Factory.

Training details:

epochs: 3
learning rate: 3e-6
learning rate scheduler type: cosine
warmup ratio: 0.1
cutoff len (i.e. context length): 32768
orpo beta (i.e. $\lambda$ in the ORPO paper): 0.05
global batch size: 128
fine-tuning type: full parameters
optimizer: paged_adamw_32bit
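
To make the role of the `orpo beta` setting above more concrete, here is a minimal sketch of the per-example ORPO objective as described in [1]: a supervised loss on the chosen response plus a weighted odds-ratio penalty. It is only an illustration, not the training code used for this model. The function names `log_odds` and `orpo_loss` are hypothetical, and the inputs are assumed to be length-normalized (average per-token) log-probabilities, which must be strictly negative.

```python
import math


def log_odds(avg_logp: float) -> float:
    """log(p / (1 - p)) where p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, beta: float = 0.05) -> float:
    """Sketch of the ORPO loss for one preference pair (hypothetical helper).

    chosen_avg_logp / rejected_avg_logp: average per-token log-probability
    of the chosen / rejected response under the model being trained.
    beta: weight of the odds-ratio term (lambda in the ORPO paper).
    """
    # Supervised term: negative log-likelihood of the chosen response.
    sft_term = -chosen_avg_logp
    # Log odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return sft_term + beta * odds_ratio_term


# The loss is lower when the model prefers the chosen response.
print(orpo_loss(-0.5, -2.0) < orpo_loss(-2.0, -0.5))
```

With beta set to 0.05, the odds-ratio term acts as a small correction on top of the supervised loss rather than dominating it.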

2. Usage

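from transformers import pipeline

# Chat history in the role/content message format.
messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant.",
    },
    # User prompt: "Briefly introduce what machine learning is."
    {"role": "user", "content": "简要地介绍一下什么是机器学习"},
]
# Build a text-generation pipeline; the model weights are downloaded on first use.
chatbot = pipeline(
    "text-generation",
    model="shenzhi-wang/Mistral-7B-v0.3-Chinese-Chat",
    max_length=32768,  # upper bound on prompt plus generated tokens
)
print(chatbot(messages))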
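
The snippet above prints the raw pipeline output. As a rough sketch, the reply text can be pulled out with a small helper like the one below. It assumes the chat-style output shape of recent transformers releases, where each result holds the whole conversation under `generated_text` as a list of role/content messages; if your version returns a plain string there, the helper does not apply. The name `last_assistant_reply` and the `mock_output` value are hypothetical and stand in for a real model call.

```python
def last_assistant_reply(pipeline_output):
    """Return the content of the final assistant message, or None if absent.

    Assumes pipeline_output looks like
    [{"generated_text": [{"role": ..., "content": ...}, ...]}].
    """
    conversation = pipeline_output[0]["generated_text"]
    for message in reversed(conversation):
        if message.get("role") == "assistant":
            return message.get("content")
    return None


# Hand-written stand-in for chatbot(messages); not real model output.
mock_output = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "简要地介绍一下什么是机器学习"},
            {"role": "assistant", "content": "(model reply goes here)"},
        ]
    }
]
print(last_assistant_reply(mock_output))
```

In practice you would pass `chatbot(messages)` in place of `mock_output`.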

3. Examples

The following are some examples generated by Mistral-7B-v0.3-Chinese-Chat, covering role playing, function calling, math, RuoZhiBa (弱智吧), safety, writing, and coding.

Citation

If our Mistral-7B-v0.3-Chinese-Chat is helpful, please kindly cite as:

@misc {shenzhi_wang_2024,
 author = {Wang, Shenzhi and Zheng, Yaowei and Wang, Guoyin and Song, Shiji and Huang, Gao},
 title = { Mistral-7B-v0.3-Chinese-Chat (Revision 754841d) },
 year = 2024,
 url = { https://huggingface.co/shenzhi-wang/Mistral-7B-v0.3-Chinese-Chat },
 doi = { 10.57967/hf/2317 },
 publisher = { Hugging Face }
}