Model Card for LLaVa-Phi-2-3B
Model Details
Model Description
- Developed by: LAION, SkunkworksAI & Ontocord
- Model type: LLaVA is an open-source chatbot trained by fine-tuning Phi-2 on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture.
- Finetuned from model: Phi-2
- License: MIT
- Demo: llava-phi-2-3b-demo
Model Sources
- Repository: BakLLaVa
Evaluation
Benchmarks
| Model | Parameters | SQA | GQA | TextVQA | POPE |
|---|---|---|---|---|---|
| LLaVA-1.5 | 7.3B | 68.0 | 62.0 | 58.3 | 85.3 |
| MC-LLaVA-3B | 3B | - | 49.6 | 38.59 | - |
| LLaVA-Phi | 3B | 68.4 | - | 48.6 | 85.0 |
| moondream1 | 1.6B | - | 56.3 | 39.8 | - |
| llava-phi-2-3b | 3B | 69.0 | 51.2 | 47.0 | 86.0 |
Image Captioning (MS COCO)
| Model | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | ROUGE_L | CIDEr | SPICE |
|---|---|---|---|---|---|---|---|---|
| llava-1.5-7b | 75.8 | 59.8 | 45 | 33.3 | 29.4 | 57.7 | 108.8 | 23.5 |
| llava-phi-2-3b | 67.7 | 50.5 | 35.7 | 24.2 | 27.0 | 52.4 | 85.0 | 20.7 |
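To make the gap in the captioning table easier to read, the sketch below expresses each llava-phi-2-3b score as a percentage of the corresponding llava-1.5-7b score. The numbers are copied from the table; the helper name `relative_scores` and the rounding to one decimal are illustrative choices, not part of the original evaluation.

```python
# Scores copied from the MS COCO captioning table above.
METRICS = ["BLEU_1", "BLEU_2", "BLEU_3", "BLEU_4", "METEOR", "ROUGE_L", "CIDEr", "SPICE"]
llava_15_7b = [75.8, 59.8, 45.0, 33.3, 29.4, 57.7, 108.8, 23.5]
llava_phi_2_3b = [67.7, 50.5, 35.7, 24.2, 27.0, 52.4, 85.0, 20.7]


def relative_scores(small, large):
    """Return each small-model score as a percentage of the large-model score.

    Hypothetical helper for illustration only.
    """
    return [round(100.0 * s / l, 1) for s, l in zip(small, large)]


relative = dict(zip(METRICS, relative_scores(llava_phi_2_3b, llava_15_7b)))

for name, pct in relative.items():
    print(f"{name:8s} {pct:5.1f}% of llava-1.5-7b")
```

By this measure the 3B model stays closest to the 7B model on METEOR and falls furthest behind on BLEU_4.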
Safetensors · Model size: 3B params · Tensor type: F32