| ▲ | atleastoptimal 5 days ago |
| > No, LLMs will not get better.
| What makes you so sure of this? They've been getting better like clockwork every few months for the past 5 years. |
|
| ▲ | bigstrat2003 5 days ago | parent | next [-] |
| I don't claim that they won't get better, but they certainly haven't so far. From the original release of ChatGPT to now, they still suck in the exact same ways. |
| |
| ▲ | johnisgood 5 days ago | parent [-] |
| I don't think they have gotten better either (at least in the past year), because I remember how much better ChatGPT or even Claude used to be. Perhaps they are nerfed now for commercial use; who knows. |
|
|
| ▲ | otabdeveloper4 5 days ago | parent | prev [-] |
| No they haven't. They hallucinate exactly as much as they did five years ago. |
| |
| ▲ | atleastoptimal 5 days ago | parent | next [-] |
| Absolutely untrue. Claiming GPT-3 hallucinates as much as o3 over the same token horizon on the same prompts is a silly notion, easily disproven by dozens of benchmarks. You can build a complete web app with models now, something far beyond the means of models back then. |
| ▲ | otabdeveloper4 5 days ago | parent [-] |
| > caveats and weasel words
| > "benchmarks"
| Stop drinking the Kool-Aid and making excuses for LLM limitations; learn to use the tools properly given their limits instead. |
| |
| ▲ | antihero 5 days ago | parent | prev [-] |
| They really don’t though. |
| ▲ | otabdeveloper4 5 days ago | parent [-] |
| Larger context lengths are awesome, but they don't fundamentally change the failure modes of LLMs. |
|
|