resters 5 days ago

Suppose there is an LLM that has a very small context size but reasons extremely well within it. That LLM would be useful for a different set of tasks than an LLM with a massive context that reasons somewhat less effectively.

Any dimension of LLM training and inference can be thought of as a tradeoff that makes the model better for some tasks and worse for others. In some scenarios, a heavily quantized model that returns a result in 10ms may be more useful than a higher-precision one that takes 200ms.