Description

MaziyarPanahi/neural-chat-7b-v3-3-GPTQ is a GPTQ-quantized version of Intel/neural-chat-7b-v3-3.

How to use

Install the necessary packages

pip install --upgrade accelerate auto-gptq transformers
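
After installing, it can help to confirm which versions were actually picked up. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_versions` is hypothetical and not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in the current environment
            found[name] = None
    return found


if __name__ == "__main__":
    required = ["accelerate", "auto-gptq", "transformers"]
    for name, ver in installed_versions(required).items():
        print(f"{name}: {ver or 'not installed'}")
```

Any package reported as "not installed" needs to be installed before running the example below.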

Example Python code

from transformers import AutoTokenizer, pipeline
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "MaziyarPanahi/neural-chat-7b-v3-3-GPTQ"

# Quantization settings passed to the loader: 4-bit weights, groups of 128
quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False,
)

# Load the quantized weights onto the first GPU
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_safetensors=True,
    device="cuda:0",
    quantize_config=quantize_config,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,  # needed for temperature and top_p to take effect
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.1,
)

outputs = pipe("What is a large language model?")
print(outputs[0]["generated_text"])
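
The example above sends a bare question to the model. The base model's card (Intel/neural-chat-7b-v3-3) describes a `### System:` / `### User:` / `### Assistant:` prompt layout. Assuming that format carries over to this quantized checkpoint, a prompt could be assembled as sketched below. The helper `build_prompt` and the default system message are hypothetical and used for illustration only.

```python
def build_prompt(user_message, system_message="You are a helpful assistant."):
    """Format a single-turn prompt in the assumed neural-chat layout."""
    return (
        f"### System:\n{system_message}\n"
        f"### User:\n{user_message}\n"
        "### Assistant:\n"
    )


prompt = build_prompt("What is a large language model?")
print(prompt)
```

The resulting string can be passed to `pipe(...)` in place of the raw question.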

Safetensors

Model size: 1B params

Tensor type: I32, F16
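
The reported size of about 1B parameters for a 7B model, together with the I32 tensor type, is consistent with how GPTQ checkpoints are commonly stored: 4-bit weights are packed eight to a 32-bit integer, with an F16 scale per group. The sketch below works through that arithmetic for a single linear layer. The 4096 × 4096 layer shape is an illustrative assumption, the helper name is hypothetical, and the counts are an estimate rather than an exact description of this checkpoint's files.

```python
def packed_layer_counts(in_features, out_features, bits=4, group_size=128):
    """Estimate stored element counts for one GPTQ-packed linear layer.

    Assumes quantized weights and zero points are packed into 32-bit integers
    and that each group of `group_size` input rows shares one F16 scale per
    output column.
    """
    weights = in_features * out_features
    per_int32 = 32 // bits                   # 8 values per int32 at 4 bits
    groups = in_features // group_size
    return {
        "qweight_i32": weights // per_int32,                # packed weights
        "scales_f16": groups * out_features,                # one scale per group and column
        "qzeros_i32": groups * out_features // per_int32,   # packed zero points
    }


counts = packed_layer_counts(4096, 4096)
print(counts)
```

Under these assumptions, the 16,777,216 original weights of the layer are stored as 2,097,152 I32 elements, roughly an eightfold reduction in element count, which is why the displayed parameter count is far below 7B.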

Model tree for MaziyarPanahi/neural-chat-7b-v3-3-GPTQ

Base model: mistralai/Mistral-7B-v0.1

Finetuned: Intel/neural-chat-7b-v3-1

Finetuned: Intel/neural-chat-7b-v3-3

Finetuned (7): this model

Collection including MaziyarPanahi/neural-chat-7b-v3-3-GPTQ

quantized LLMs by AutoGPTQ (28 items)

URL: https://huggingface.co/MaziyarPanahi/neural-chat-7b-v3-3-GPTQ
