From Inpainting to Editing: Unlocking Robust Mask-Free Visual Dubbing via Generative Bootstrapping

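Xu He^1,*, Haoxian Zhang^2,†, Hejia Chen³, Changyuan Zheng¹, Liyang Chen¹,
Songlin Tang², Jiehui Huang⁴, Xiaoqiang Liu², Pengfei Wan², Zhiyong Wu^1,5,✉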

¹Tsinghua University ²Kling Team, Kuaishou Technology ³Beihang University ⁴HKUST ⁵CUHK
^*Work done at Kling Team, Kuaishou Technology ^†Project leader ^✉Corresponding author

Please refer to the GitHub README for usage.

Paper: https://arxiv.org/abs/2512.25066
Project Page: https://hjrphoebus.github.io/X-Dub/
Code: https://github.com/KlingAIResearch/X-Dub

📌 TL;DR

X-Dub is a visual dubbing system that synchronizes a character's lip movements in a video to match arbitrary input audio. This repository hosts the public Wan-based X-Dub release and its pretrained weights.
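Since the pretrained weights are hosted in this repository, a minimal sketch for fetching them locally could look like the following. It assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the helper names and the `checkpoints/X-Dub` target directory are illustrative choices, not part of the official release. The directory layout that the inference code expects is described in the GitHub README, not here.

```python
from urllib.parse import urlparse


def repo_id_from_url(url: str) -> str:
    """Extract the 'owner/name' repo id from a huggingface.co model URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model URL: {url}")
    return "/".join(parts[:2])


def download_weights(repo_id: str, local_dir: str = "checkpoints/X-Dub") -> str:
    """Download every file in the model repo and return the local path.

    The default local_dir is a hypothetical location chosen for this sketch.
    """
    # Imported lazily so repo_id_from_url works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_weights(repo_id_from_url("https://huggingface.co/KlingTeam/X-Dub"))` would then pull the repository contents into the chosen directory.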

🌟 Citation

Please cite our paper if you find our work helpful.

@article{he2025from,
 title={From Inpainting to Editing: A Self-Bootstrapping Framework for Context-Rich Visual Dubbing},
 author={He, Xu and Zhang, Haoxian and Chen, Hejia and Zheng, Changyuan and Chen, Liyang and Tang, Songlin and Huang, Jiehui and Liu, Xiaoqiang and Wan, Pengfei and Wu, Zhiyong},
 journal={arXiv preprint arXiv:2512.25066},
 year={2025}
}

Paper for KlingTeam/X-Dub

Paper • 2512.25066 • Published Dec 31, 2025 • 5

URL: https://huggingface.co/KlingTeam/X-Dub
