URL: https://x.com/togethercompute/status/2066672024165274064

Together AI

@togethercompute

Optimizing GLM 5.1 came down to three things:
-> Rewrote the indexer topk kernel
-> Fused the indexer kernel to reduce memory and launch overhead
-> Eliminated CPU overhead that was gating prefill throughput

The bigger win was in the indexer. Once we fixed that, the rest made it even flable on Together AI.
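A rough illustration of why fusing a scoring step with top-k selection can cut memory and launch overhead. This is a CPU-side sketch in plain Python, not the GPU kernel from the post. It assumes the indexer scores each key against a query and keeps the k best positions, which the post does not spell out. The names `dot`, `topk_unfused`, and `topk_fused` are hypothetical.

```python
import heapq


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def topk_unfused(query, keys, k):
    """Two passes: materialize every score, then select the k best indices."""
    scores = [dot(query, key) for key in keys]  # full score buffer, O(n) memory
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def topk_fused(query, keys, k):
    """One pass: score each key and keep only a k-sized heap, O(k) memory."""
    if k <= 0:
        return []
    heap = []  # min-heap of (score, -index); heap[0] is the worst kept entry
    for i, key in enumerate(keys):
        item = (dot(query, key), -i)  # -index makes ties favor the lower index
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    # Best score first; equal scores come out in ascending index order.
    return [-neg_i for _, neg_i in sorted(heap, reverse=True)]


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[0.5, 1.0], [2.0, 0.0], [1.0, 3.0], [2.0, 5.0], [-1.0, 0.0]]
    print(topk_unfused(query, keys, 3))
    print(topk_fused(query, keys, 3))
```

Both functions return the same indices. The fused version never stores the full score list, which is the kind of intermediate buffer (and extra pass) that kernel fusion is meant to remove.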

11:59 PM · Jun 15, 2026 · 15.3K Views
