URL: https://huggingface.co/MaziyarPanahi/neural-chat-7b-v3-3-GPTQ

MaziyarPanahi/neural-chat-7b-v3-3-GPTQ · Hugging Face


Description

MaziyarPanahi/neural-chat-7b-v3-3-GPTQ is a GPTQ-quantized version of Intel/neural-chat-7b-v3-3.

How to use

Install the necessary packages

pip install --upgrade accelerate auto-gptq transformers
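
Before loading the model, it can help to confirm that the packages installed above are importable and that a CUDA device is visible, since the example below targets "cuda:0". The following is a minimal sketch, not part of the original model card; the helper names `missing_packages` and `cuda_available` are hypothetical, and it assumes PyTorch was pulled in as a dependency of the packages above.

```python
import importlib.util

# Import names for the pip packages above.
# Note: the pip package "auto-gptq" is imported as "auto_gptq".
REQUIRED = ["accelerate", "auto_gptq", "transformers"]


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


def cuda_available():
    """Return True only if torch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check still works without torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
    print("CUDA available:", cuda_available())
```

If the check reports missing packages, rerunning the pip command above in the active environment is the first thing to try.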

Example Python code

from transformers import AutoTokenizer, pipeline
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "MaziyarPanahi/neural-chat-7b-v3-3-GPTQ"

# Quantization settings passed when loading the checkpoint
quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False,
)

# Load the quantized weights onto the first GPU
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_safetensors=True,
    device="cuda:0",
    quantize_config=quantize_config,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

# do_sample=True is required for temperature and top_p to take effect;
# without it, generation is greedy and these settings are ignored
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.1,
)

outputs = pipe("What is a large language model?")
print(outputs[0]["generated_text"])
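
To give a feel for what `bits=4` and `group_size=128` mean in practice, here is a rough back-of-the-envelope sketch of the storage needed for the quantized weights. It is an estimate under stated assumptions, not a measurement of this checkpoint: it assumes a round 7 billion quantized weights, one 16-bit scale and one 4-bit zero point per group, and it ignores layers that stay in 16-bit (such as embeddings). The function names are hypothetical.

```python
def gptq_weight_bytes(n_weights, bits=4, group_size=128, scale_bits=16):
    """Approximate bytes needed for group-wise quantized weights.

    Assumes each group of `group_size` weights stores one scale
    (`scale_bits` wide) and one zero point (`bits` wide).
    """
    n_groups = n_weights / group_size
    weight_bits = n_weights * bits
    overhead_bits = n_groups * (scale_bits + bits)
    return (weight_bits + overhead_bits) / 8


def packed_int32_elements(n_weights, bits=4):
    """Number of 32-bit integers needed if quantized weights are packed into int32."""
    return n_weights * bits // 32


N_WEIGHTS = 7_000_000_000  # assumption: rounded size of a 7B model

quantized = gptq_weight_bytes(N_WEIGHTS)
fp16 = N_WEIGHTS * 2  # 2 bytes per weight in 16-bit floating point

print(f"4-bit GPTQ estimate: {quantized / 2**30:.2f} GiB")
print(f"FP16 baseline:       {fp16 / 2**30:.2f} GiB")
print(f"Packed int32 count:  {packed_int32_elements(N_WEIGHTS):,}")
```

Because eight 4-bit values fit in one 32-bit integer, a packed checkpoint holds far fewer tensor elements than the original parameter count, which is one plausible reason a 7B model can be listed with a much smaller size when its weights are stored as I32.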
Safetensors
Model size: 1B params
Tensor type: I32 · F16
