URL: https://huggingface.co/RichardErkhov/trollek_-_LittleInstructionMaker-4B-v0.1-gguf


Quantization made by Richard Erkhov.


LittleInstructionMaker-4B-v0.1 - GGUF

Name                                        Quant method  Size
LittleInstructionMaker-4B-v0.1.Q2_K.gguf    Q2_K          1.41GB
LittleInstructionMaker-4B-v0.1.IQ3_XS.gguf  IQ3_XS        0.21GB
LittleInstructionMaker-4B-v0.1.IQ3_S.gguf   IQ3_S         0.14GB
LittleInstructionMaker-4B-v0.1.Q3_K_S.gguf  Q3_K_S        0.13GB
LittleInstructionMaker-4B-v0.1.IQ3_M.gguf   IQ3_M         0.34GB
LittleInstructionMaker-4B-v0.1.Q3_K.gguf    Q3_K          0.25GB
LittleInstructionMaker-4B-v0.1.Q3_K_M.gguf  Q3_K_M        0.24GB
LittleInstructionMaker-4B-v0.1.Q3_K_L.gguf  Q3_K_L        0.08GB
LittleInstructionMaker-4B-v0.1.IQ4_XS.gguf  IQ4_XS        0.0GB
LittleInstructionMaker-4B-v0.1.Q4_0.gguf    Q4_0          0.04GB
LittleInstructionMaker-4B-v0.1.IQ4_NL.gguf  IQ4_NL        0.16GB
LittleInstructionMaker-4B-v0.1.Q4_K_S.gguf  Q4_K_S        1.22GB
LittleInstructionMaker-4B-v0.1.Q4_K.gguf    Q4_K          2.23GB
LittleInstructionMaker-4B-v0.1.Q4_K_M.gguf  Q4_K_M        2.23GB
LittleInstructionMaker-4B-v0.1.Q4_1.gguf    Q4_1          2.33GB
LittleInstructionMaker-4B-v0.1.Q5_0.gguf    Q5_0          2.55GB
LittleInstructionMaker-4B-v0.1.Q5_K_S.gguf  Q5_K_S        1.14GB
LittleInstructionMaker-4B-v0.1.Q5_K.gguf    Q5_K          2.62GB
LittleInstructionMaker-4B-v0.1.Q5_K_M.gguf  Q5_K_M        2.62GB
LittleInstructionMaker-4B-v0.1.Q5_1.gguf    Q5_1          2.78GB
LittleInstructionMaker-4B-v0.1.Q6_K.gguf    Q6_K          3.03GB
LittleInstructionMaker-4B-v0.1.Q8_0.gguf    Q8_0          3.92GB
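
The files listed above can be loaded with any GGUF-compatible runtime. The following is a minimal sketch, assuming the llama-cpp-python package is installed and one of the files has been downloaded locally. The helper names `load_gguf` and `generate_instruction`, the local path, and the context size are illustrative assumptions, not part of the original card. It also assumes the runtime parses the ChatML markers in the prompt as special tokens.

```python
# Prompt that stops right after the user header, so the model writes the user turn.
MAGPIE_TEMPLATE = "<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n"


def load_gguf(model_path: str, n_ctx: int = 8192):
    """Load a local GGUF file (hypothetical helper; the path is wherever you saved it)."""
    # Imported lazily so the rest of the sketch works without the package installed.
    from llama_cpp import Llama

    return Llama(model_path=model_path, n_ctx=n_ctx, verbose=False)


def generate_instruction(llm, system_message: str, max_tokens: int = 512) -> str:
    """Ask the model to write one user instruction for the given system message."""
    if not system_message:
        raise ValueError("the model needs a non-empty system message")
    prompt = MAGPIE_TEMPLATE.format(system_message=system_message)
    result = llm.create_completion(
        prompt,
        max_tokens=max_tokens,
        temperature=0.9,       # sampling values borrowed from the original card's example
        repeat_penalty=1.1,
        stop=["<|im_end|>"],   # stop at the end of the generated user turn
    )
    return result["choices"][0]["text"].strip()
```

Typical use would be `llm = load_gguf("LittleInstructionMaker-4B-v0.1.Q4_K_M.gguf")` followed by `generate_instruction(llm, "You are an AI coding assistant.")`, with the file name swapped for whichever quant was downloaded.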

Original model description:

license: apache-2.0
datasets:
- Crystalcareai/openhermes_200k_unfiltered
- mlabonne/orpo-dpo-mix-40k
- jondurbin/airoboros-3.2
- abacusai/SystemChat-1.1
- trollek/SimpleInstructionJudge-v01
- cgato/SlimOrcaDedupCleaned
language:
- en
library_name: transformers
base_model: h2oai/h2o-danube3-4b-base
tags:
- mergekit
- magpie

LittleInstructionMaker-4B-v0.1

A small model to create prompts the Magpie way.

The secret sauce turned out to be also training on the prompts. I did that last with SystemChat-1.1 in order to be able to steer the prompt generation. It does not work without a system message.

Now imagine, if you will, having this bad boy generate a bunch of different prompts right, and having another model like, I mean.. LittleInstructionJudge right, judge all of the instructions right, and then slam a serverfarm with the cream of the crop right.

In other words, by giving it a system prompt like "You are a creative writing partner", "You are an advanced coding assistant", "You are a damn good psychologist", etc., you can quickly generate prompts for a niche dataset that can then be answered by a large model.

In a different language (translated from Danish): Using Magpie's insight of exploiting the nature of language models to create alignment data, you can use this language model to write instructions, and even steer the content through the system message.
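
The generate, judge, and keep-the-best workflow described above could be sketched roughly as follows. This is only an illustration: `build_instruction_pool` is a hypothetical helper, and `generate` and `judge` are placeholder callables standing in for this model and for whatever judge model is used. No particular scoring scale is assumed beyond "higher is better".

```python
from typing import Callable, Iterable


def build_instruction_pool(
    system_messages: Iterable[str],
    generate: Callable[[str], str],   # system message -> one generated instruction
    judge: Callable[[str], float],    # instruction -> quality score, higher is better
    per_system: int = 4,
    keep_top: int = 10,
) -> list[tuple[str, str, float]]:
    """Generate instructions per system message, score them, and keep the best ones."""
    scored = []
    seen = set()
    for system_message in system_messages:
        for _ in range(per_system):
            instruction = generate(system_message).strip()
            # Skip empty generations and exact duplicates before paying for a judge call.
            if not instruction or instruction in seen:
                continue
            seen.add(instruction)
            scored.append((system_message, instruction, judge(instruction)))
    # Highest-scoring instructions first; these are the ones worth sending to a large model.
    scored.sort(key=lambda row: row[2], reverse=True)
    return scored[:keep_top]
```

Keeping the system message next to each instruction means the same steering prompt can be reused when a larger model writes the answer.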

Training

All the datasets were used separately and the results merged together using Model Stock, except for SystemChat-1.1, where I fine-tuned it using LoRA+ with train_on_prompt set to True.
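
To illustrate what training on the prompts changes: in common fine-tuning setups, prompt tokens are masked out of the loss by setting their labels to -100 (the index PyTorch's cross-entropy loss ignores by default), so only the response is learned. When prompts are trained on as well, the prompt tokens keep their labels, which is what lets a model learn to write user turns. The following is a minimal sketch with made-up token IDs. `build_labels` is a hypothetical helper, not the training framework's actual code.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def build_labels(prompt_ids: list[int], response_ids: list[int], train_on_prompt: bool) -> list[int]:
    """Return per-token labels for a prompt followed by a response."""
    if train_on_prompt:
        # Prompt tokens contribute to the loss, so the model also learns to write prompts.
        prompt_labels = list(prompt_ids)
    else:
        # Prompt tokens are masked; only the response is learned.
        prompt_labels = [IGNORE_INDEX] * len(prompt_ids)
    return prompt_labels + list(response_ids)


prompt_ids = [11, 12, 13]   # made-up token IDs for illustration
response_ids = [21, 22]
print(build_labels(prompt_ids, response_ids, train_on_prompt=False))  # [-100, -100, -100, 21, 22]
print(build_labels(prompt_ids, response_ids, train_on_prompt=True))   # [11, 12, 13, 21, 22]
```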

Datasets

Using this model to make instructions

<|im_start|>system
{{system_message}}<|im_end|>
<|im_start|>user

It actually generates an EOS token at the end of a "user" prompt. Lawdy that has been a pain when trying to use large models for this purpose. Good luck; have fun.
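
The template above can be built, and the generated text cleaned up, with two small helpers. This is a sketch: `build_magpie_prompt` and `extract_instruction` are hypothetical names, and it assumes the end-of-turn marker shows up as the literal text `<|im_end|>` when special tokens are not stripped during decoding.

```python
IM_END = "<|im_end|>"


def build_magpie_prompt(system_message: str) -> str:
    """Build a prompt that ends right after the user header, so the model writes the user turn."""
    if not system_message or not system_message.strip():
        raise ValueError("the model does not work without a system message")
    return f"<|im_start|>system\n{system_message}{IM_END}\n<|im_start|>user\n"


def extract_instruction(generated_text: str) -> str:
    """Keep only the text before the first end-of-turn marker, if one is present."""
    instruction, _, _ = generated_text.partition(IM_END)
    return instruction.strip()


print(build_magpie_prompt("You are an AI coding assistant."))
print(extract_instruction("Write a haiku about merge conflicts.<|im_end|>\n<|im_start|>assistant\n"))
```

Because the model emits an EOS token at the end of the user turn, generation normally stops on its own; the extraction step is a safety net for runtimes that keep going.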

Response preview

Giving the model this text at a temperature of 0.9:

<|im_start|>system
You are an AI coding assistant.<|im_end|>
<|im_start|>user

Will return this:

Hey, can you help me write a simple program that generates a Fibonacci sequence until a certain number of terms? Like this: Fib(5) should give me the first five numbers in the series.

Code example to use it

import torch
from unsloth import FastLanguageModel

# Load the original (non-GGUF) model in 4-bit for inference.
model, tokenizer = FastLanguageModel.from_pretrained(
    "trollek/LittleInstructionMaker-4B-v0.1",
    dtype=torch.bfloat16,
    load_in_4bit=True,
    max_seq_length=8192,
)
FastLanguageModel.for_inference(model)

def instruction_generator(system_message: str, num_instructions: int):
    # The model does not work without a system message.
    if not system_message:
        raise ValueError("system_message must be a non-empty string")
    if num_instructions < 1:
        raise ValueError("num_instructions must be at least 1")
    # End the prompt right after the user header so the model writes the user turn itself.
    magpie_template = f"<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n"
    input_ids = tokenizer(magpie_template, return_tensors="pt").input_ids.to("cuda")
    for _ in range(num_instructions):
        generated_ids = model.generate(input_ids, max_new_tokens=512, temperature=0.9, repetition_penalty=1.1, do_sample=True, eos_token_id=tokenizer.eos_token_id)
        # Decode only the newly generated tokens, skipping the prompt.
        response = tokenizer.decode(generated_ids[0][input_ids.shape[-1]:], skip_special_tokens=True, clean_up_tokenization_spaces=True)
        yield response

for instruct in instruction_generator("You are an AI coding assistant.", 2):
    print(instruct)

# Can you help me write a simple programming language syntax?
# I want to create a Python program for a social media app that allows users to post and comment on stories. The message I want to convey is that staying connected with others is essential in life. Can you suggest a way to design the program?