
URL: https://huggingface.co/pmadinei/Interlace-Qwen3-VL-8B-10pc



โœ‚๏ธ Interlace-Qwen3-VL-8B-10pc

๐Ÿ‘ Paper
๐Ÿ‘ Project Page
๐Ÿ‘ GitHub
๐Ÿ‘ Collection
๐Ÿ‘ CVPR 2026

This model was produced by INTERLACE, a layer-pruning framework for Vision-Language Models. Three of the 36 transformer layers in Qwen/Qwen3-VL-8B-Instruct (the nominal 10% pruning ratio) were removed using triplet-based similarity analysis, and the pruned model was then fine-tuned on 1% of FineVision for a single epoch.

94.0% average relative performance retained  |  10% layers dropped (3 of 36)  |  33 layers remaining

📋 Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-VL-8B-Instruct |
| Pruning Method | INTERLACE (triplet-based interleaved pruning) |
| Pruning Ratio | 10% (3 of 36 layers removed) |
| Remaining Layers | 33 |
| Hidden Size | 4096 |
| Fine-tuning Data | 1% of FineVision (~240K samples) |
| Fine-tuning Epochs | 1 |
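
The layer counts above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the nominal 10% ratio is rounded down to a whole number of layers, which is consistent with the "3 of 36" figure but is not confirmed by the model card. The constant and variable names are made up for this example.

```python
import math

BASE_LAYERS = 36       # transformer layers in the base model, per the table
NOMINAL_RATIO = 0.10   # the "10pc" in the model name

# Assumption: the nominal ratio is rounded down to a whole number of layers.
removed = math.floor(BASE_LAYERS * NOMINAL_RATIO)
remaining = BASE_LAYERS - removed

# Fraction of layers actually removed, which is slightly below the nominal 10%.
effective_ratio = removed / BASE_LAYERS

if __name__ == "__main__":
    print(f"removed={removed}, remaining={remaining}, "
          f"effective ratio={effective_ratio:.1%}")
```

Under that assumption, the effective fraction of layers removed is 3/36, or roughly 8.3%.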

🚀 Usage

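from transformers import AutoModelForImageTextToText, AutoProcessor

# Load the pruned checkpoint. flash_attention_2 requires the flash-attn package;
# drop the argument to fall back to the default attention implementation.
model = AutoModelForImageTextToText.from_pretrained(
    "pmadinei/Interlace-Qwen3-VL-8B-10pc",
    dtype="auto",
    device_map="auto",
    attn_implementation="flash_attention_2",
)
# The processor (tokenizer and image preprocessing) is loaded from the base model.
processor = AutoProcessor.from_pretrained("Qwen/Qwen3-VL-8B-Instruct")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/image.jpg"},
            {"type": "text", "text": "Describe this image in detail."},
        ],
    }
]

# Apply the chat template and tokenize the text and image in one step.
inputs = processor.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the echoed prompt.
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))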
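
For prompts with a varying number of images, the `messages` structure can be assembled programmatically. The helper below is a minimal sketch: `build_messages` is a hypothetical name, not part of `transformers`, and it only reproduces the dictionary layout used in the snippet above. Whether the pruned model handles multi-image prompts as well as the base model is not stated in the model card.

```python
from typing import Dict, List, Sequence


def build_messages(image_paths: Sequence[str], prompt: str) -> List[Dict]:
    """Build a single-turn user message with zero or more images and one text prompt.

    Hypothetical helper: it mirrors the message layout from the usage
    snippet and performs no file or model access.
    """
    # One {"type": "image"} entry per image, in the order given.
    content = [{"type": "image", "image": path} for path in image_paths]
    # The text prompt follows the images, as in the usage snippet.
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]


if __name__ == "__main__":
    print(build_messages(["path/to/image.jpg"], "Describe this image in detail."))
```

The returned list can be passed to `processor.apply_chat_template` in place of the hand-written `messages`.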

📊 Performance

Relative performance compared to the unpruned baseline (% of baseline score, Chain-of-Thought enabled):

| Category | Benchmark | Relative Perf. |
|---|---|---|
| Text/Chart | AI2D | 93.4% |
| Text/Chart | ChartQA | 98.4% |
| Text/Chart | OCRBench | 93.2% |
| Text/Chart | TextVQA | 96.3% |
| General VQA | MMBench | 93.2% |
| General VQA | POPE | 99.5% |
| General VQA | RealWorldQA | 94.5% |
| Perception | HRBench4K | 91.4% |
| Perception | HRBench8K | 87.1% |
| Perception | V-Star | 90.9% |
| Inst & Sci | MIABench | 95.1% |
| Inst & Sci | ScienceQA | 94.6% |
| Overall | Average | 94.0% |
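
The overall figure can be reproduced from the per-benchmark numbers. The sketch below copies the values from the table and assumes the overall average is an unweighted mean over the 12 benchmarks; the function and variable names are illustrative and do not come from the INTERLACE codebase.

```python
from statistics import mean

# Relative performance (% of baseline) copied from the table above.
relative_perf = {
    "Text/Chart": {"AI2D": 93.4, "ChartQA": 98.4, "OCRBench": 93.2, "TextVQA": 96.3},
    "General VQA": {"MMBench": 93.2, "POPE": 99.5, "RealWorldQA": 94.5},
    "Perception": {"HRBench4K": 91.4, "HRBench8K": 87.1, "V-Star": 90.9},
    "Inst & Sci": {"MIABench": 95.1, "ScienceQA": 94.6},
}


def category_means(table):
    """Unweighted mean of the benchmark scores within each category."""
    return {cat: mean(scores.values()) for cat, scores in table.items()}


def overall_mean(table):
    """Unweighted mean over every benchmark, ignoring category boundaries."""
    return mean(s for scores in table.values() for s in scores.values())


if __name__ == "__main__":
    for cat, avg in category_means(relative_perf).items():
        print(f"{cat}: {avg:.1f}%")
    print(f"Overall: {overall_mean(relative_perf):.1f}%")
```

Under this assumption the overall mean rounds to the reported 94.0%, and Perception is the weakest category at about 89.8%.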


๐Ÿ“ Citation

@inproceedings{madinei2026interlace,
 title={Interlace: Interleaved layer pruning and efficient adaptation in large vision-language models},
 author={Madinei, Parsa and Solgi, Ryan and Wen, Ziqi and Skaza, Jonathan and Eckstein, Miguel and Pedarsani, Ramtin},
 booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
 pages={2947--2956},
 year={2026}
}