Samantha Qwen2 7B AWQ

Trained on 2x4090 GPUs using QLoRA and FSDP.

LoRA

Launch Using vLLM

python -m vllm.entrypoints.openai.api_server \
 --model macadeliccc/Samantha-Qwen2-7B-AWQ \
 --chat-template ./examples/template_chatml.jinja \
 --quantization awq

from openai import OpenAI

# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="macadeliccc/Samantha-Qwen2-7B-AWQ",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
)
print("Chat response:", chat_response)
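Printing the whole response object shows metadata as well as the reply. A minimal sketch of pulling out just the assistant's text, assuming the server returns the standard OpenAI chat-completion shape. `reply_text` is a hypothetical helper name, and `fake_response` is a made-up stand-in so the snippet can be tried without a running server:

```python
from types import SimpleNamespace


def reply_text(chat_response):
    """Return the assistant's text from a chat-completion response."""
    # The first choice holds the generated message; its content is the reply.
    return chat_response.choices[0].message.content


# Stand-in with the same attribute layout as a real response (illustrative only).
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(role="assistant", content="Example reply")
        )
    ]
)

print(reply_text(fake_response))
```

With a live server, passing `chat_response` from the snippet above in place of `fake_response` should work the same way.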

Prompt Template

<|im_start|>system
You are a friendly assistant.<|im_end|>
<|im_start|>user
What is the capital of France?<|im_end|>
<|im_start|>assistant
The capital of France is Paris.
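The template above is ChatML. A minimal sketch of assembling such a prompt by hand, assuming each turn is wrapped as `<|im_start|>role`, a newline, the content, then `<|im_end|>` and a newline. `format_chatml` and its `add_generation_prompt` parameter are hypothetical names for illustration. The vLLM server applies the Jinja chat template itself, so this is only relevant when sending raw prompts:

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # Each turn: start tag + role, newline, content, end tag, newline.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]
prompt = format_chatml(messages)
print(prompt)
```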

Quants

Built with Axolotl

Safetensors

Model size: 8B params
Tensor type: I32, F16

Model tree for macadeliccc/Samantha-Qwen2-7B-AWQ

Base model: Qwen/Qwen2-7B
Quantized (44): this model

URL: https://huggingface.co/macadeliccc/Samantha-Qwen2-7B-AWQ
