tough 6 days ago

but unlike CUDA, there are no custom kernels for inference in the vLLM repo...

I think