pllbnk 2 hours ago

I think it’s a mistake to assume we will blindly go in this direction for many years and then suddenly, collectively wake up and realize what we have done. It’s a great filter and a great opportunity.

If LLMs stop improving at the pace of the last few years (I believe they are already slowing down), they will still manage to crank out billions of lines of code which they themselves won’t be able to grep and reason through, leading to a drop in quality and lost revenue for the companies that choose to go all-in on LLMs.

But let’s be realistic - modern LLMs are still a great and useful tool when used properly, so they are here to stay. Our goal will be to keep them on track and reduce the negative impact of hallucinations.

As a result, the software industry will move away from large, complex, interconnected systems that have millions of features but only a few of them actively used, toward small, high-quality, targeted tools - because their work will be easier to verify and their side effects easier to control.

lelanthran 2 hours ago | parent | next [-]

> If LLMs stop improving at the pace of the last few years (I believe they already are slowing down)

Depending on how you measure "improvement" they already have or they never will :-/

Measuring model capability as a ratio of context length, you reach the limit at around 300k-400k tokens of context; after that, returns diminish. We have already passed this point.

Measuring capability purely by output, smarter harnesses may unlock even more improvements in the future; basically a twist on the "Sufficiently Smart Compiler" (https://wiki.c2.com/?SufficientlySmartCompiler=).

Those are the two extremes, but there's more on the spectrum in between.

rgbrenner an hour ago | parent [-]

300k-400k isn’t the current limit if you create modules and/or organize the code reasonably, for the same reason we do this for humans: it allows us to interact with a component without loading its internals into our context.

you can also execute larger tasks than this by using subagents to divide the work so that each segment doesn’t exceed the usable context window. i regularly execute tasks that require hundreds of subagents, for example.

in practice the context window is effectively unlimited, or at least exceptionally high (100m+ tokens). it just requires you to structure the work so it can be done effectively, not so dissimilar to what you would do for a person.
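The divide-and-conquer pattern described above can be sketched roughly as follows. This is a hypothetical illustration, not any real agent framework's API: `run_subagent` is a stand-in for an actual LLM call, the token estimate is a crude heuristic, and the budget value is an assumption taken from the numbers discussed in this thread.

```python
# Sketch: greedily pack work items into segments that each fit within a
# single agent's usable context, then hand each segment to a "subagent".
# All names here (run_subagent, CONTEXT_BUDGET) are placeholders.

CONTEXT_BUDGET = 300_000  # assumed usable tokens per agent, per the thread


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token."""
    return max(1, len(text) // 4)


def split_into_segments(files: list[str], budget: int) -> list[list[str]]:
    """Greedily pack files into segments that stay under the token budget."""
    segments: list[list[str]] = []
    current: list[str] = []
    used = 0
    for f in files:
        cost = estimate_tokens(f)
        if current and used + cost > budget:
            segments.append(current)
            current, used = [], 0
        current.append(f)
        used += cost
    if current:
        segments.append(current)
    return segments


def run_subagent(segment: list[str]) -> str:
    # Placeholder for a real LLM call operating on one bounded segment.
    return f"processed {len(segment)} file(s)"


def run_task(files: list[str]) -> list[str]:
    """Fan the segments out to subagents and collect their results."""
    return [run_subagent(seg) for seg in split_into_segments(files, CONTEXT_BUDGET)]
```

The point of the sketch is only the shape of the workflow: no single agent ever sees more than its budget, so the total task size is bounded by how well you can partition it, not by one context window.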

jmalicki an hour ago | parent [-]

That makes it not a context window.

Organizing the code as you describe, and deciding how agents interact with it so the actual context window stays small, is the fundamental challenge.

lelanthran an hour ago | parent [-]

I keep getting surprised that people who are all-in on this (" i regular execute tasks that require hundreds of subagents ") don't have any idea of what is happening even a single layer below their interface to the LLM ("in practice the context window is effectively unlimited or at least exceptionally high — 100m+ tokens.")

I looked at that response by GP (rgbrenner) and refrained from replying because if someone is both running hundreds of agents at a time AND oblivious to what "context window" means, there is no possible sane discourse that would result from any engagement.

leptons 2 hours ago | parent | prev [-]

I wish I got to hallucinate at work, and just get a pat on the head for constantly doing the wrong thing.

oompydoompy74 12 minutes ago | parent | next [-]

The title for that is Director, VP, or CTO at any given large enterprise company.

pllbnk an hour ago | parent | prev | next [-]

Maybe I am unlucky, but I have worked with too many developers who couldn't make a good decision if their life depended on it. LLMs at least know how to argue convincingly for their decisions.

2ndorderthought 2 hours ago | parent | prev [-]

I mean you can do that, but the job probably doesn't pay too much. Might enrich your spirituality though.