BoppreH 14 hours ago

Controversial opinion from a casual user, but state-of-the-art LLMs now feel to me more intelligent than the average person on the steet. It also explains why training on more average-quality data (if there's any left) isn't yielding improvements.

But LLMs are hamstrung by their harnesses. They are doing the equivalent of providing technical support via phone call: little to no context, and limited to a bidirectional stream of words (tokens). The best agent harnesses have the equivalent of vision-impairment accessibility interfaces, and even those are still subpar.

Heck, giving LLMs time to think was once a groundbreaking idea. Yesterday I saw Claude Code editing a file using shell redirects! It's barbaric.
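For readers unfamiliar with the pattern: "editing via shell redirects" means the agent rewrites or appends to a file with plain redirection instead of a structured edit tool. A minimal sketch of what that looks like (the filename and contents here are made up for illustration):

```shell
# Overwrite an entire file with a heredoc redirect -- the whole file is
# re-emitted even to change one line.
cat > config.txt <<'EOF'
timeout = 30
retries = 5
EOF

# Append a single line with >> -- no awareness of the file's structure.
echo "verbose = true" >> config.txt
```

It works, but compared to a diff-based edit tool it burns tokens on unchanged content and risks clobbering the file, which is presumably what makes it feel "barbaric".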

I expect future improvements to come from harness improvements, especially around sub agents/context rollbacks (to work around the non-linear cost of context) and LLM-aligned "accessibility tools". That, or more synthetic training data.

8note 5 hours ago | parent | next [-]

> But LLMs are hamstrung by their harnesses

entirely so. i think Anthropic updated something about the compact algorithm recently, and it's gone from working well over long sessions to basically garbage whenever a compact happens

xyzsparetimexyz 14 hours ago | parent | prev [-]

Steet? Do you mean street? They're smarter in the same way a search engine is smarter.

BoppreH 11 hours ago | parent [-]

Yes, "street". Typing from my phone, sorry.

And search engines are narrow tools that can only output copies of their datasets. An LLM is capable of surprisingly novel output, even if the exact level of creativity is heavily debated.

xyzsparetimexyz 7 hours ago | parent [-]

Remixes aren't novel.

WhatIsDukkha 6 hours ago | parent [-]

Human cultures are remixes all the way down...