What's the best way to run this in production for online serving?
#5
by fikrikarim - opened
I usually use vLLM, but it doesn't support the nar version.
Hi @fikrikarim
Sadly, the model is not supported in vLLM yet.
I hope we'll have an implementation in the future.
I'll post an update here if anything changes.
