philipkiely 8 days ago

TRT-LLM has its challenges from a DX perspective, and yeah, for multimodal we still use vLLM pretty often.

But for the kind of traffic we are trying to serve -- high volume and latency sensitive -- TRT-LLM consistently wins head-to-head in our benchmarking, and we have invested a ton of dev work in the tooling around it.