From Inpainting to Editing: Unlocking Robust Mask-Free Visual Dubbing via Generative Bootstrapping
Xu He1,*
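Haoxian Zhang2,†
Hejia Chen3
Changyuan Zheng1
Liyang Chen1
Songlin Tang2
Jiehui Huang4
Xiaoqiang Liu2
Pengfei Wan2
Zhiyong Wu1,5,✉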
1Tsinghua University
2Kling Team, Kuaishou Technology
3Beihang University
4HKUST
5CUHK
*Work done at Kling Team, Kuaishou Technology
†Project leader
✉Corresponding author
Please refer to the GitHub README for usage.
- Paper: https://arxiv.org/abs/2512.25066
- Project Page: https://hjrphoebus.github.io/X-Dub/
- Code: https://github.com/KlingAIResearch/X-Dub
📌 TL;DR
X-Dub is a visual dubbing system that synchronizes a character's lip movements in a video to match arbitrary input audio. This repository hosts the public Wan-based X-Dub release and its pretrained weights.
🌟 Citation
Please cite our paper if you find our work helpful.
@article{he2025from,
  title={From Inpainting to Editing: A Self-Bootstrapping Framework for Context-Rich Visual Dubbing},
  author={He, Xu and Zhang, Haoxian and Chen, Hejia and Zheng, Changyuan and Chen, Liyang and Tang, Songlin and Huang, Jiehui and Liu, Xiaoqiang and Wan, Pengfei and Wu, Zhiyong},
  journal={arXiv preprint arXiv:2512.25066},
  year={2025}
}
