Model Card for Llama-3.2-1B-Instruct-APIGen-FC-v0.1
This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on argilla-warehouse/apigen-synth-trl dataset, a version of argilla/Synth-APIGen-v0.1 ready to do SFT on top of it. It has been trained using TRL.
Quick start
This is a Fine tuned version of Llama-3.2-1B-Instruct model specific for Function Calling, to showcase how to fine tune a model on top of a dataset
like argilla/Synth-APIGen-v0.1.
Helper functions for the prompt and output parsing
Examples
The following examples show how to use the model with transformers, for different types of queries and depending on the availability of tools.
Example of simple function call:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "argilla-warehouse/Llama-3.2-1B-Instruct-APIGen-FC-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)
get_weather_api = {
"name": "get_weather",
"description": "Get the current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, New York"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The unit of temperature to return"
}
},
"required": ["location"]
}
}
search_api = {
"name": "search",
"description": "Search for information on the internet",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query, e.g. 'latest news on AI'"
}
},
"required": ["query"]
}
}
available_tools = [get_weather_api, search_api]
query = "What's the weather like in New York in fahrenheit?"
messages = prepare_messages(query, tools=available_tools)
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
result = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=False)
response = parse_response(result)
# [{'name': 'get_weather', 'arguments': {'location': 'New York', 'unit': 'fahrenheit'}}]
Parallel function call
Multiple function call
Parallel multiple function call
Multi-turn function call
Irrelevance function call (examples when some data is missing)
Training procedure
👁 Visualize in Weights & Biases
This model was trained with SFT. You can take a look at sft.slurm to see the
training script, if you don't have access to a slurm cluster, it can be run jsut using the accelerate command. It took 13 minutes in a node with 8xH100.
To install the requirements, the following commands can be used:
uv venv .venv --python 3.11
source .venv/bin/activate
git clone https://github.com/huggingface/trl.git
uv pip install .
uv pip install wandb
uv pip install deepspeed
And login to your WandB and Hugging Face accounts to push both logs and the final model.
Framework versions
- TRL: 0.12.0.dev0
- Transformers: 4.45.1
- Pytorch: 2.4.1
- Datasets: 3.0.1
- Tokenizers: 0.20.0
Citations
Cite TRL as:
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}
- Downloads last month
- 59
