
URL: https://www.phoronix.com/news/Intel-llm-scaler-vllm-Whisper

Intel's New LLM-Scaler Beta Update Brings Whisper Model & GLM-4.5-Air Support - Phoronix




Written by Michael Larabel in Intel on 22 August 2025 at 05:59 AM EDT.
Earlier this month Intel released LLM-Scaler 1.0 as part of its Project Battlematrix initiative. LLM-Scaler is a Docker container effort to deliver speedy AI inference performance, with multi-GPU scaling, PCIe P2P support, and more.

Although v1.0 was announced earlier this month, yesterday Intel software engineers released "0.9.0-b3" as a new beta release of the llm-scaler-vllm Docker build.

The updated LLM-Scaler vLLM beta adds support for Whisper models, GLM-4.5-Air, and the dots.ocr model, and enables GLM-4.1V-9B-Thinking for image input. Beyond the additional models, yesterday's beta also optimizes vLLM memory usage and enables the Ray back-end for pipeline parallelism.
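For readers curious what pipeline parallelism with the Ray back-end looks like in practice, here is a minimal sketch that assembles a launch command for upstream vLLM's OpenAI-compatible server. The `--pipeline-parallel-size` and `--distributed-executor-backend` options are upstream vLLM flags; whether Intel's llm-scaler-vllm container uses the same entry point and defaults is an assumption, so check Intel's GitHub documentation before relying on it. The helper name `build_vllm_serve_command` and the model identifier in the usage line are illustrative only.

```python
import shlex


def build_vllm_serve_command(model, pipeline_parallel_size=2, use_ray=True):
    """Return the argv list for launching upstream vLLM's server.

    Hypothetical helper for illustration; it only builds the command
    and does not start anything.
    """
    if pipeline_parallel_size < 1:
        raise ValueError("pipeline_parallel_size must be at least 1")

    argv = [
        "vllm", "serve", model,
        # Number of pipeline stages the model is split across.
        "--pipeline-parallel-size", str(pipeline_parallel_size),
    ]
    if use_ray:
        # Ask vLLM to use Ray as its distributed executor.
        argv += ["--distributed-executor-backend", "ray"]
    return argv


if __name__ == "__main__":
    # The model identifier below is an assumed Hugging Face name.
    command = build_vllm_serve_command("zai-org/GLM-4.5-Air", 2)
    print(shlex.join(command))
```

Building the command as a list rather than a single string keeps it safe to hand to `subprocess.run` without shell quoting issues.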



Downloads and more details on the new Intel LLM-Scaler vLLM release via GitHub.

Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience. Michael has written more than 20,000 articles covering the state of Linux hardware support, Linux performance, graphics drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated benchmarking software. He can be followed via Twitter, LinkedIn, or contacted via MichaelLarabel.com.