Pinned
- LightCompress (Public)
  [EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models, including LLMs, VLMs, and video generative models.
- Qwen-Image-Lightning (Public)
  Qwen-Image-Lightning: speed up the Qwen-Image model with distillation.
- Wan2.2-Lightning (Public, forked from Wan-Video/Wan2.2)
  Wan2.2-Lightning: speed up the Wan2.2 model with distillation.
Repositories
Showing 10 of 72 repositories
- LightCompress (Public)
  [EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models, including LLMs, VLMs, and video generative models.
- SageAttention3-sparse (Public, forked from thu-ml/SageAttention)
  [ICLR 2025, ICML 2025, NeurIPS 2025 Spotlight] Quantized attention achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
-
People
This organization has no public members.
