
URL: https://docs.ltx.video/welcome

API Documentation | LTX Documentation


Generate video with synchronized audio from text, images, and audio inputs. Two APIs are available: a sync API that returns the video in a single HTTP call, which is simplest for short clips and quick experiments, and an async API that submits a job and polls for the result, which is recommended for production because polling avoids holding a long-lived connection open.
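The submit-and-poll pattern behind the async API can be sketched roughly as follows. This is a minimal illustration, not the LTX client: `poll_until_done`, the `fetch_status` callable, and the status strings `"completed"` and `"failed"` are all assumptions introduced here, and a real integration would take its endpoint paths and response fields from the API reference.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status() until the job reaches a terminal state or times out.

    fetch_status is a hypothetical callable that performs one status request
    and returns the decoded JSON body as a dict.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_status()
        # "completed" and "failed" are placeholder status names (assumption).
        if job.get("status") == "completed":
            return job
        if job.get("status") == "failed":
            raise RuntimeError(f"job failed: {job.get('error', 'unknown error')}")
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        # Wait between requests instead of holding one long-lived connection.
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in production the defaults are usually sufficient.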

Powered by the most downloaded open-source video model on Hugging Face. Engineered for real-world workloads with predictable performance at any volume. Stable outputs, consistent fidelity, and infrastructure-grade reliability.

LTX API Capabilities

All endpoints return video with synchronized audio — dialogue, music, and ambient sound are generated together with the visuals.

Text-to-Video

Generate video from a text description. Describe a scene, camera movement, and mood, and the API returns a complete video with matching audio. Each request supports up to 4K resolution and up to 20 seconds of video.
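The stated limits can be checked on the client before a request is sent. The sketch below is illustrative only: the function name and the payload keys (`prompt`, `duration`, `resolution`) are hypothetical, and it assumes "4K" means 3840x2160 in either orientation, which the text does not specify.

```python
from typing import Any, Dict

MAX_DURATION_S = 20      # from the text: up to 20 seconds per request
MAX_LONG_SIDE = 3840     # assumption: "4K" is taken to mean 3840x2160 (UHD)
MAX_SHORT_SIDE = 2160


def build_text_to_video_payload(
    prompt: str, duration_s: float, width: int, height: int
) -> Dict[str, Any]:
    """Validate inputs against the documented limits and build a request body.

    The returned field names are placeholders, not the confirmed API schema.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 0 < duration_s <= MAX_DURATION_S:
        raise ValueError(f"duration must be in (0, {MAX_DURATION_S}] seconds")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Allow landscape or portrait by comparing the long and short sides.
    if max(width, height) > MAX_LONG_SIDE or min(width, height) > MAX_SHORT_SIDE:
        raise ValueError("resolution exceeds the assumed 4K limit")
    return {
        "prompt": prompt,
        "duration": duration_s,
        "resolution": f"{width}x{height}",
    }
```

For example, `build_text_to_video_payload("a fox crossing a snowy field at dawn, slow dolly-in", 8, 1920, 1080)` returns a dict, while a 21-second request raises `ValueError` before any network call is made.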

Image-to-Video

Animate a still image with realistic motion, depth, and audio. Provide a reference image and a prompt describing the desired motion. The output preserves the visual identity of the source image.

Audio-to-Video

Generate video driven by an audio track. Supply dialogue, music, or ambient sound and the API produces visuals synchronized to the audio. Optionally condition on a reference image for visual direction.

Retake

Re-generate a specific section of an existing video without starting over. Select a time range and mode (replace video, audio, or both) to iterate on parts of a generation while keeping the rest intact.
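A retake is described by a time range and a mode, which can be modeled as a small validated value. In this sketch the mode strings, the `RetakeRequest` type, and the choice of seconds as the time unit are assumptions made for illustration; the actual parameter names may differ.

```python
from dataclasses import dataclass

# Placeholder mode names for "replace video, audio, or both" (assumption).
RETAKE_MODES = frozenset({"replace_video", "replace_audio", "replace_audio_and_video"})


@dataclass(frozen=True)
class RetakeRequest:
    start_s: float  # start of the section to regenerate, in seconds
    end_s: float    # end of the section to regenerate, in seconds
    mode: str       # one of RETAKE_MODES


def make_retake_request(
    video_duration_s: float, start_s: float, end_s: float, mode: str
) -> RetakeRequest:
    """Check that the range lies inside the video and that the mode is known."""
    if mode not in RETAKE_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 0 <= start_s < end_s <= video_duration_s:
        raise ValueError("time range must satisfy 0 <= start < end <= video duration")
    return RetakeRequest(start_s=start_s, end_s=end_s, mode=mode)
```

Everything outside `[start_s, end_s]` is what the retake is meant to leave intact, so rejecting out-of-range values early avoids spending a generation on a request that cannot succeed.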

Extend

Lengthen an existing video from the beginning or end. Provide a video, a duration, and a context window — the API generates new frames that continue seamlessly from the original, preserving audio and visual continuity.
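The relationship between the three inputs can be worked through with a small helper. This sketch assumes that the context window is the number of seconds of the original video, adjacent to the extended side, that conditions the new frames; the text does not define it precisely, and the function and key names below are hypothetical.

```python
from typing import Any, Dict


def plan_extension(
    video_duration_s: float,
    extend_by_s: float,
    context_s: float,
    direction: str = "end",
) -> Dict[str, Any]:
    """Compute which part of the original is used as context and the new length.

    Assumes the context window is measured in seconds of the original video.
    """
    if direction not in ("start", "end"):
        raise ValueError("direction must be 'start' or 'end'")
    if extend_by_s <= 0:
        raise ValueError("extension duration must be positive")
    if not 0 < context_s <= video_duration_s:
        raise ValueError("context window must fit inside the original video")
    if direction == "end":
        # New frames continue from the last context_s seconds.
        context_range = (video_duration_s - context_s, video_duration_s)
    else:
        # New frames lead into the first context_s seconds.
        context_range = (0.0, context_s)
    return {
        "direction": direction,
        "context_range_s": context_range,
        "total_duration_s": video_duration_s + extend_by_s,
    }
```

Under these assumptions, extending a 10-second video by 5 seconds at the end with a 2-second context window uses seconds 8 to 10 of the original as context and yields a 15-second result.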

Get Started