food-vlm-tiny-quality-v2 (merged full model)
Merged full model checkpoint produced from local fine-tuning for food image to nutrition JSON generation.
Training data
- Dataset: Codatta/MM-Food-100K
- Training approach: supervised fine-tuning on conversational vision-language examples
Intended output format
The model is optimized for JSON-only responses with ingredient and nutrition structure.
Quickstart
from transformers import AutoProcessor, AutoModelForImageTextToText
repo = "thyfriendlyfox/food-vlm-tiny-quality-v2-merged"
processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(repo, trust_remote_code=True)
- Downloads last month
- 2
Safetensors
Model size
6.29M params
Tensor type
BF16
·
