dkhenry 6 hours ago

You make a compelling argument, but thankfully I have data to back up my anecdotal experience.

This comparison shows them neck and neck https://benchlm.ai/compare/claude-sonnet-4-5-vs-gemma-4-31b

As does this one https://llm-stats.com/models/compare/claude-sonnet-4-6-vs-ge...

And the pelican benchmark even shows them pretty close https://simonwillison.net/2026/Apr/2/gemma-4/ https://simonwillison.net/2025/Sep/29/claude-sonnet-4-5/

Also, this isn't a fringe statement; most people who have done an evaluation agree with me.

jmward01 4 hours ago | parent [-]

One area I find hard to get around is context length. Everything self-hosted is so limited on context that it's marginal to use. Additionally, tools like Claude Code are clearly in the training mix for Anthropic's models, so they seem to get a boost over other models pushed into that environment. That said, open-source and local inference is -really- good and only going to get better. There is no doubt that the current frontier biz model is not sustainable.