[NeurIPS 2025] Unifying Visual Understanding and Generation via Text-Aligned Representations
Jiaming Han, Hao Chen†, Yang Zhao, Hanyu Wang, Qi Zhao, Ziyan Yang, Hao He, Xiangyu Yue‡, Lu Jiang‡
† Project Lead ‡ Corresponding Authors
Project Page
Tar Paper on arXiv (arXiv:2506.18898)
Hugging Face Model
Hugging Face Space
Citation
@article{han2025tar,
  title={Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations},
  author={Han, Jiaming and Chen, Hao and Zhao, Yang and Wang, Hanyu and Zhao, Qi and Yang, Ziyan and He, Hao and Yue, Xiangyu and Jiang, Lu},
  journal={arXiv preprint arXiv:2506.18898},
  year={2025},
}
License
This project is licensed under the Apache 2.0 License.
Safetensors
Model size: 3B params
Tensor type: BF16
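As a rough guide to what the model size and tensor type imply in practice, the sketch below estimates the memory needed to hold the weights alone. It assumes the "3B params" figure is rounded (the exact parameter count is not stated here) and ignores activations, caches, and framework overhead; the helper name `weight_memory_gib` is hypothetical and used only for illustration.

```python
# Assumption: "3B params" is a rounded figure; the exact count is not given.
# BF16 stores each parameter in 2 bytes.
BYTES_PER_DTYPE = {"bf16": 2, "fp16": 2, "fp32": 4}


def weight_memory_gib(num_params: float, dtype: str = "bf16") -> float:
    """Approximate memory for the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_DTYPE[dtype] / 1024**3


approx_gib = weight_memory_gib(3e9, "bf16")
print(f"~{approx_gib:.1f} GiB for weights only")
```

Actual usage at inference time will be higher than this estimate, so it is best read as a lower bound when choosing hardware.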
