This repository contains the data for the paper PAVE: Patching and Adapting Video Large Language Models.
Code: https://github.com/dragonlzm/PAVE
Citation
Paper: https://arxiv.org/abs/2503.19794
BibTeX:
@misc{liu2025pavepatchingadaptingvideo,
title={PAVE: Patching and Adapting Video Large Language Models},
author={Zhuoming Liu and Yiquan Li and Khoi Duc Nguyen and Yiwu Zhong and Yin Li},
year={2025},
eprint={2503.19794},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.19794},
}
Model size: 97.7M params (Safetensors, tensor type F32)
