VOOZH about

URL: https://huggingface.co/thyfriendlyfox/food-vlm-tiny-quality-v3-merged

⇱ thyfriendlyfox/food-vlm-tiny-quality-v3-merged · Hugging Face


food-vlm-tiny-quality-v3 (merged full model)

Merged full model checkpoint produced from local fine-tuning for food image to nutrition JSON generation.

Training data

  • Dataset: Codatta/MM-Food-100K
  • Training approach: supervised fine-tuning on conversational vision-language examples

Intended output format

The model is optimized for JSON-only responses with ingredient and nutrition structure.

Quickstart

from transformers import AutoProcessor, AutoModelForImageTextToText

repo = "thyfriendlyfox/food-vlm-tiny-quality-v3-merged"
processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(repo, trust_remote_code=True)
Downloads last month
2
Safetensors
Model size
6.29M params
Tensor type
BF16
·

Model tree for thyfriendlyfox/food-vlm-tiny-quality-v3-merged

Dataset used to train thyfriendlyfox/food-vlm-tiny-quality-v3-merged