Yeah. The docs tell you that you should build it yourself, but…
but unlike CUDA, there are no custom kernels for inference in the vLLM repo...
I think