As you alluded to at the end of your post—I'm not really convinced 20k LOC is very limiting. How many lines of code can you fit in your working mental model of a program? Certainly less than 20k concrete lines of text at any given time.

In your working mental model, you have broad understandings of the broader domain. You have broad understandings of the architecture. You summarize broad sections of the program into simpler ideas. module_a does x, module_b does y, insane file c does z, and so on. Then there is the part of the software you're actively working on, where you need more concrete context.

So as you move towards the central task, the context becomes more specific. But the vague outer context is still crucial to the task at hand. Now, you can certainly find ways to summarize this mental model in an input to an LLM, especially with increasing context windows. But we probably need to understand how we would better present these sorts of things to achieve performance similar to a human brain, because the mechanism is very different.

▲

jacobr1 2 days ago | parent [-]

This is basically how claude code works today. You have it /init a description of the project structure into CLAUDE.md that is used for each invocation. There is some implicit knowledge in the project about common frameworks and languages. Then when working on something between the explicit and implicit knowledge and the task at hand it will grep for relevant material in the project, load either full or parts of files, and THEN it will start working on the task. But it dynamically builds the context of the codebase based on searching for the relevant bit. Short-circuiting this by having a good project summary makes it more efficient - but you don't need to literally copy in all the code files.

	▲	HarHarVeryFunny 2 days ago \| parent [-]
		Interesting - thanks!