URL: https://x.com/togethercompute/status/2066936299836039645

Together AI on X


.@DecagonAI cut voice agent cost per turn nearly 6x with Together AI. They moved from closed models to fine-tuned open models, while keeping latency low enough for real-time voice:

→ <400ms p95 model latency per turn
→ custom speculators and prompt caching
→ optimized serving on NVIDIA Blackwell
→ weekly, sometimes daily model deployment velocity

This is the closed-to-open shift: more control, better tokenomics, and production performance without being locked into proprietary APIs.

How Decagon Engineered Sub-Second Voice AI with Together AI
The challenge with running voice agents
Voice latency is audible. Decagon's leadership frames voice as the most demanding surface because latency is immediately perceptible. Long pauses create awkward...