jorvi 3 hours ago
Models capped out on training compute and (active) parameter counts a while ago; it's tooling / harnesses that are making the big jumps in performance happen. And then you have things like DeepSeek with a pretty small KV cache. And with the extreme chip shortages expected for the next two years, there's little appetite for even bigger models anyway. Barring a breakthrough in scaling, the only direction models can really go is smaller. Which will inevitably mean better-performing local models for the same chip budget.
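For a sense of why the KV cache point matters: a back-of-the-envelope sketch comparing per-token KV cache memory under conventional grouped-query attention vs. a compressed-latent scheme like DeepSeek's MLA. All dimensions here (layer count, head counts, latent dim) are illustrative assumptions, not exact model configs.

```python
# Rough KV-cache-per-token comparison: grouped-query attention (GQA)
# vs. a compressed-latent scheme in the spirit of DeepSeek's MLA.
# Every dimension below is an assumption for illustration only.

def kv_bytes_per_token(layers, entries_per_layer, bytes_per_value=2):
    """Bytes of KV cache one token occupies (fp16/bf16 by default)."""
    return layers * entries_per_layer * bytes_per_value

# Conventional GQA: cache full K and V vectors for every KV head.
gqa = kv_bytes_per_token(
    layers=61,
    entries_per_layer=2 * 8 * 128,  # K+V, 8 KV heads, head_dim 128 (assumed)
)

# MLA-style: cache one compressed latent plus a small decoupled RoPE key.
mla = kv_bytes_per_token(
    layers=61,
    entries_per_layer=512 + 64,  # latent dim 512 + RoPE dim 64 (assumed)
)

print(f"GQA-style: {gqa} bytes/token")
print(f"MLA-style: {mla} bytes/token ({gqa / mla:.1f}x smaller)")
```

With these assumed numbers the latent cache comes out a few times smaller per token, which is exactly the kind of saving that makes long contexts cheap enough to serve on smaller hardware.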