	philipkiely 8 days ago
		TRT-LLM has its challenges from a DX perspective, and yeah, for multimodal we still use vLLM pretty often. But for the kind of traffic we are trying to serve -- high-volume and latency-sensitive -- it consistently wins head-to-head in our benchmarking, and we have invested a ton of dev work in the tooling around it.