🌟 Instructions for downloading, using, and reproducing our models are available at this GitHub repo. If you like our models, we would greatly appreciate a star on our GitHub repository and a "like" on our HuggingFace repositories. Thank you!

The main branch contains the q8_0 GGUF files for Llama3-8B-Chinese-Chat-v2.1. If you want to use our q8_0 GGUF files for Llama3-8B-Chinese-Chat-v1, please refer to the v1 branch; if you want to use our q8_0 GGUF files for Llama3-8B-Chinese-Chat-v2, please refer to the v2 branch.
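The branch layout described above can be expressed as a small lookup. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the helper names are illustrative, and the GGUF filename is a placeholder that the caller must take from the repository's file list.

```python
# Sketch only: helper names are illustrative, not part of the repository.
REPO_ID = "shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit"

# Branch layout as described above: v2.1 on main, v1 and v2 on their own branches.
VERSION_TO_BRANCH = {"v2.1": "main", "v2": "v2", "v1": "v1"}


def branch_for_version(version: str) -> str:
    """Return the repository branch that holds the q8_0 GGUF for a model version."""
    try:
        return VERSION_TO_BRANCH[version]
    except KeyError:
        raise ValueError(f"unknown model version: {version!r}") from None


def download_gguf(version: str, filename: str) -> str:
    """Download `filename` from the branch for `version` and return the local path.

    `filename` is a placeholder supplied by the caller; check the repository's
    file list for the actual GGUF filename.
    """
    # Imported lazily so the mapping above is usable without the package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=filename,
        revision=branch_for_version(version),
    )
```

Passing the branch as `revision` lets one script fetch any of the three versions without cloning the whole repository.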

For optimal performance, we refrain from fine-tuning the model's identity. Thus, inquiries such as "Who are you" or "Who developed you" may yield random responses that are not necessarily accurate.

Updates

🚀🚀🚀 [May 6, 2024] We now introduce Llama3-8B-Chinese-Chat-v2.1! Compared to v1, the training dataset of v2.1 is 5x larger (~100K preference pairs), and it exhibits significant enhancements, especially in roleplay, function calling, and math capabilities! Compared to v2, v2.1 is better at math and less prone to including English words in Chinese responses. The training dataset of Llama3-8B-Chinese-Chat-v2.1 will be released soon. If you love our Llama3-8B-Chinese-Chat-v1 or v2, you won't want to miss out on Llama3-8B-Chinese-Chat-v2.1!
🔥 We provide an online interactive demo for Llama3-8B-Chinese-Chat-v2 here. Have fun with our latest model!
🔥 We provide the official Ollama model for the q4_0 GGUF version of Llama3-8B-Chinese-Chat-v2.1 at wangshenzhi/llama3-8b-chinese-chat-ollama-q4! Run the following command for quick use of this model: ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-q4.
🔥 We provide the official Ollama model for the q8_0 GGUF version of Llama3-8B-Chinese-Chat-v2.1 at wangshenzhi/llama3-8b-chinese-chat-ollama-q8! Run the following command for quick use of this model: ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-q8.
🔥 We provide the official Ollama model for the f16 GGUF version of Llama3-8B-Chinese-Chat-v2.1 at wangshenzhi/llama3-8b-chinese-chat-ollama-fp16! Run the following command for quick use of this model: ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-fp16.
🔥 We provide the official q4_0 GGUF version of Llama3-8B-Chinese-Chat-v2.1 at https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-4bit!
🔥 We provide the official q8_0 GGUF version of Llama3-8B-Chinese-Chat-v2.1 at https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit!
🔥 We provide the official f16 GGUF version of Llama3-8B-Chinese-Chat-v2.1 at https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-f16!
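The Ollama models listed above can also be queried from a script rather than the interactive `ollama run` prompt. The following is a minimal sketch, assuming an Ollama server is running locally on its default port (11434) and that the model has already been pulled; `build_chat_payload` and `ollama_chat` are illustrative helper names, and only the Python standard library is used.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default address.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "wangshenzhi/llama3-8b-chinese-chat-ollama-q8"


def build_chat_payload(user_text, model=MODEL, system_prompt=None):
    """Build the JSON body for a single non-streaming chat request."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    return {"model": model, "messages": messages, "stream": False}


def ollama_chat(user_text, **kwargs):
    """Send one chat request to the local server and return the reply text."""
    body = json.dumps(build_chat_payload(user_text, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

Swapping `MODEL` for the q4 or fp16 model name selects a different quantization without changing the rest of the code.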

Model Summary

Llama3-8B-Chinese-Chat is an instruction-tuned language model for Chinese and English users, built upon the Meta-Llama-3-8B-Instruct model, with abilities such as role playing and tool use.

Developers: Shenzhi Wang*, Yaowei Zheng*, Guoyin Wang (in.ai), Shiji Song, Gao Huang. (*: Equal Contribution)

License: Llama-3 License
Base Model: Meta-Llama-3-8B-Instruct
Model Size: 8.03B
Context length: 8K
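The parameter count above gives a quick way to estimate how much space each GGUF variant needs. The following is a rough sketch: it counts weights only, ignores file metadata and the KV cache, and assumes every tensor uses the named format (real files may keep some tensors at a different precision), so actual file sizes will differ somewhat.

```python
# Bits per weight for the GGUF formats mentioned in this card.
# q8_0 and q4_0 store weights in blocks of 32 with one 16-bit scale per block:
#   q8_0: (32 * 8 + 16) / 32 = 8.5 bits, q4_0: (32 * 4 + 16) / 32 = 4.5 bits.
BITS_PER_WEIGHT = {"f16": 16.0, "q8_0": 8.5, "q4_0": 4.5}

N_PARAMS = 8.03e9  # model size stated above


def estimated_size_gb(quant: str, n_params: float = N_PARAMS) -> float:
    """Rough weight-only size in GB (10^9 bytes) for a quantization format."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


for quant in ("f16", "q8_0", "q4_0"):
    print(f"{quant}: ~{estimated_size_gb(quant):.1f} GB")
```

Under these assumptions the q8_0 weights take roughly half the space of f16, and q4_0 roughly a quarter.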

1. Introduction

This is the first model specifically fine-tuned for Chinese & English users through ORPO [1] based on the Meta-Llama-3-8B-Instruct model.

Compared to the original Meta-Llama-3-8B-Instruct model, our Llama3-8B-Chinese-Chat-v1 model significantly reduces the issues of "Chinese questions with English answers" and the mixing of Chinese and English in responses.

Compared to Llama3-8B-Chinese-Chat-v1, our Llama3-8B-Chinese-Chat-v2 model significantly increases the training data size (from 20K to 100K), which yields substantial performance improvements, especially in role playing, tool use, and math.

[1] Hong, Jiwoo, Noah Lee, and James Thorne. "Reference-free Monolithic Preference Optimization with Odds Ratio." arXiv preprint arXiv:2403.07691 (2024).

Training framework: LLaMA-Factory.

Training details:

epochs: 2
learning rate: 5e-6
learning rate scheduler type: cosine
warmup ratio: 0.1
cutoff len (i.e. context length): 8192
orpo beta (i.e. $\lambda$ in the ORPO paper): 0.05
global batch size: 128
fine-tuning type: full parameters
optimizer: paged_adamw_32bit
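To show where the `orpo beta` value enters training, here is a minimal per-example sketch of the ORPO objective as formulated in the paper cited as [1]: a supervised loss on the chosen response plus a weighted odds-ratio term. It is an illustration, not the LLaMA-Factory implementation; the function names are hypothetical, and the inputs are assumed to be length-normalized (average per-token) log-probabilities.

```python
import math

ORPO_BETA = 0.05  # lambda in the ORPO paper, as listed above


def log_odds(avg_logp: float) -> float:
    """log(p / (1 - p)), where p = exp(avg_logp) is the length-normalized
    sequence likelihood. Requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float,
              beta: float = ORPO_BETA) -> float:
    """SFT loss on the chosen response plus beta times the odds-ratio loss."""
    sft_loss = -chosen_avg_logp
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    odds_ratio_loss = math.log1p(math.exp(-log_odds_ratio))
    return sft_loss + beta * odds_ratio_loss
```

With a small weight such as 0.05, the supervised term dominates, and the odds-ratio term acts as a mild penalty that pushes the chosen response's odds above those of the rejected one.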

2. Usage

from llama_cpp import Llama

# Load the GGUF file; n_gpu_layers=-1 offloads all layers to the GPU.
model = Llama(
    "/Your/Path/To/GGUF/File",
    verbose=False,
    n_gpu_layers=-1,
)

system_prompt = "You are a helpful assistant."

def generate_response(_model, _messages, _max_tokens=8192):
    # Run one chat completion and return only the assistant's text.
    _output = _model.create_chat_completion(
        _messages,
        stop=["<|eot_id|>", "<|end_of_text|>"],
        max_tokens=_max_tokens,
    )["choices"][0]["message"]["content"]
    return _output

# The following are some examples

messages = [
    {
        "role": "system",
        "content": system_prompt,
    },
    {"role": "user", "content": "写一首诗吧"},  # "Write a poem."
]

print(generate_response(model, messages))
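
The snippet above sends a single user message. For a multi-turn conversation, each assistant reply has to be appended to the message list before the next request. The following is a minimal sketch of that pattern; `chat_turn` is a hypothetical helper, and the `max_tokens` default is an arbitrary choice. It works with any object that exposes `create_chat_completion` in the llama-cpp-python return format.

```python
def chat_turn(model, messages, user_text, max_tokens=1024):
    """Append a user message, query the model, and record the assistant reply.

    `messages` is modified in place so that the next call sees the full history.
    """
    messages.append({"role": "user", "content": user_text})
    reply = model.create_chat_completion(
        messages,
        stop=["<|eot_id|>", "<|end_of_text|>"],
        max_tokens=max_tokens,
    )["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": reply})
    return reply


# Example usage, with `model` loaded as in the snippet above:
#   history = [{"role": "system", "content": "You are a helpful assistant."}]
#   print(chat_turn(model, history, "写一首诗吧"))        # "Write a poem."
#   print(chat_turn(model, history, "请把它翻译成英文"))  # "Please translate it into English."
```

Because the history grows with every turn, long conversations will eventually need truncation to stay within the 8K context length.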

3. Examples

The following are some examples generated by Llama3-8B-Chinese-Chat-v2.1, covering role playing, function calling, math, RuoZhiBa (弱智吧), safety, writing, and coding.

For the examples generated by Llama3-8B-Chinese-Chat-v1, please refer to this link.

For the examples generated by Llama3-8B-Chinese-Chat-v2, please refer to this link.

Citation

If our Llama3-8B-Chinese-Chat is helpful, please cite it as:

@misc {shenzhi_wang_2024,
 author = {Wang, Shenzhi and Zheng, Yaowei and Wang, Guoyin and Song, Shiji and Huang, Gao},
 title = { Llama3-8B-Chinese-Chat (Revision 6622a23) },
 year = 2024,
 url = { https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat },
 doi = { 10.57967/hf/2316 },
 publisher = { Hugging Face }
}


URL: https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit
