schopra909 12 hours ago

I think you nailed it.

For us it’s classifiers that we train for very specific domains.

You’d think it’d be better to just finetune a smaller non-LLM model, but empirically we find the LLM finetunes (like 7B) perform better.
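For concreteness, here is a minimal sketch of what that kind of LLM classifier fine-tune might look like, using LoRA so a 7B model fits on a single GPU. The model name, label set, and hyperparameters are illustrative assumptions, not the commenter's actual setup:

    # Sketch: fine-tuning a 7B decoder-only LLM as a domain classifier.
    # All names and values here are assumed for illustration.
    import torch
    from datasets import Dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    MODEL = "mistralai/Mistral-7B-v0.1"  # assumed; any 7B base works similarly

    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    tokenizer.pad_token = tokenizer.eos_token  # decoder-only models lack a pad token

    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL, num_labels=2, torch_dtype=torch.bfloat16
    )
    model.config.pad_token_id = tokenizer.pad_token_id

    # LoRA adapters keep the fine-tune tractable; full fine-tuning of all
    # 7B parameters would also work given enough memory.
    model = get_peft_model(model, LoraConfig(
        task_type="SEQ_CLS", r=8, lora_alpha=16, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],
    ))

    # Toy stand-in for a domain-specific labeled dataset.
    train = Dataset.from_dict({
        "text": ["refund my order", "how do I reset my password"],
        "label": [0, 1],
    }).map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=256))

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="clf", per_device_train_batch_size=2,
                               num_train_epochs=3, learning_rate=2e-4, bf16=True),
        train_dataset=train,
        tokenizer=tokenizer,
    ).train()

The classification head on top of the pretrained backbone is what gets trained here; the intuition in the comment is that the backbone's general language understanding is doing most of the work.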

moffkalast 10 hours ago

I think it's no surprise that a model with a more general understanding of text performs better than some tiny ad-hoc classifier that blindly learns a couple of patterns and has no clue what it's looking at. The tiny classifier is going to fail in much weirder ways that make no sense, like old CNN-based vision models did.
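For contrast, the kind of "tiny ad-hoc classifier" being described might be something like TF-IDF n-grams plus logistic regression. This is an illustrative sketch with made-up data, not anything from the thread, but it shows why the failure modes look senseless: the model only sees surface token statistics.

    # Sketch: a small pattern-matching baseline with no language understanding.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = ["refund my order", "how do I reset my password"]  # toy data, assumed
    labels = [0, 1]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(texts, labels)

    # A paraphrase sharing no n-grams with the training set maps to an
    # all-zero feature vector, so the prediction is essentially arbitrary:
    # the "weird, senseless" failure mode described above.
    print(clf.predict(["give me my money back"]))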