
URL: https://huggingface.co/ibm-granite/granite-speech-4.1-2b-nar/discussions/5

ibm-granite/granite-speech-4.1-2b-nar · What's the best way to run this in production for online serving?


What's the best way to run this in production for online serving?

#5
by fikrikarim - opened

I usually use vLLM, but it doesn't support the nar version.

Hi @fikrikarim
Sadly, the model is not supported in vLLM yet.
I hope we'll have an implementation in the future.
I'll update if something changes.
