hahn-kev 3 days ago:
I wonder if the people saying that would agree that they've been improving. | ||
nyonyo 2 days ago | parent:
As one on that side of the argument, I have to say I have yet to see LLMs fundamentally improve, rather than being benchmaxxed on a new set of common "trick questions" and giving off the illusion of reasoning.

Add an extra leg to any animal in a picture. Ask the vision LLM how many legs it sees. It will answer the number a person would expect from a healthy animal, because it's not actually reasoning, and it's not perceiving anything; it's pattern matching. It sees "dog", it answers "4 legs". Maybe sometime in the future it won't do that, because this kind of trick will get added to the benchmaxxing set (training LLMs specifically on pictures of animals with fewer or more legs than they should have), as happens with every new generation of these illusory things. But that won't fix the fundamental problem: these things DO NOT REASON.

Training LLMs on thousands and thousands of reasoning trick questions people ask on LM arena is borderline scamming people about the true nature of this technology. If we lived in a sane regulatory environment, OAI would have a lot to answer for.
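For what it's worth, the leg-counting probe is easy to run yourself. Here's a minimal sketch, assuming the official OpenAI Python client and a vision-capable model; the filename, the edited image, the model name, and the prompt wording are all placeholders I picked, not anything prescribed:

    # Leg-counting probe: send an edited "five-legged dog" image to a
    # vision model and see whether it reports what is in the pixels (5)
    # or the prior for "dog" (4).
    import base64
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # "five_legged_dog.png" is a hypothetical image: a dog photo edited
    # in any image editor to have one extra leg.
    with open("five_legged_dog.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Count the legs on the animal in this image. "
                         "Answer with a number only."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )

    print(response.choices[0].message.content)

If the comment's claim holds, the printed answer is "4" even though the picture shows five legs.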