| ▲ | ehnto 3 days ago |
| I disagree. I may not have the whole codebase in my head in one moment, but I have had all of it in my head at some point, and it is still there; that is not true of an LLM. I use LLMs and am impressed by them, but they just do not approximate a human in this particular area. My ability to break a problem down does not start from listing the files out and reading a few. I have a high-level understanding of the whole project at all times, and a deep understanding of the whole project stored, and I can recall that when required; this is not true of an LLM at any point. We know this is a limitation, and it's why we have various tools attempting to approximate memory and augment training on the fly, but they are approximations and, in my opinion, not even close to real human memory and depth of understanding for data the model was not trained on. The same goes for mutations of scenarios it was trained on, and code is a great example of that: an LLM is trained on billions of lines of code, yet still fails to understand my codebase intuitively. I have definitely not read billions of lines of code. |
|
| ▲ | onion2k 3 days ago | parent | next [-] |
| > My ability to break a problem down does not start from listing the files out and reading a few. If you're completely new to the problem then ... yes, it does. You're assuming that you're working on a project that you've spent time on and learned the domain for, and then you're comparing that to an LLM being prompted to look at a codebase with the context of the files. Those things are not the same, though. A closer analogy would be prompting an LLM with questions when it has access (either through MCP or training) to the project's git history, documentation, notes, issue tracker, etc. When that sort of thing is commonplace, and LLMs have the context window size to take advantage of all that information, I suspect we'll be surprised how good they are, even given the results we get today. |
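(As an aside, here is a minimal sketch of what "access through MCP to the project's git history" could look like in practice. This is a hypothetical illustration using the official `mcp` Python SDK, not anything the commenters describe; the server name, tool name, and repo path are made up.)

```python
# Hypothetical sketch: a tiny MCP server exposing a repo's git history as a
# tool that an MCP-capable client (e.g. a coding agent) could call on demand.
import subprocess
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("repo-history")  # placeholder server name

@mcp.tool()
def git_log(path: str = ".", max_entries: int = 20) -> str:
    """Return the most recent commit summaries for the repository at `path`."""
    result = subprocess.run(
        ["git", "-C", path, "log", f"--max-count={max_entries}", "--oneline"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so a client can discover and call it
```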
| |
| ▲ | ehnto 3 days ago | parent [-] | | > If you're completely new to the problem then ... yes, it does. Of course, because I am not new to the problem, whereas an LLM is new to it with every new prompt. I am not really trying to find a fair comparison, because I believe humans have an unfair advantage in this instance, and I am trying to make that point rather than compare like-for-like abilities. I think we'll find that even with all the context clues from MCPs, history, etc., they might still fail to have the insight to recall the right data into the context, but that's just a feeling I have from working with Claude Code for a while. I instruct it to do those things - look through the git log, check the documentation, etc. - and it sometimes finds a path through to an insight, but it's just as likely to get lost. I alluded to it somewhere else, but my experience with massive context windows so far has just been that they distract the LLM. We are usually guiding it down a path with each new prompt and have a specific subset of information to give it, so pumping the context full of unrelated code at the start seems to derail it from that path. That's anecdotal, though I encourage you to try messing around with it. As always, there's a good chance I will eat my hat some day. | | |
| ▲ | scott_s 3 days ago | parent [-] | | > Of course, because I am not new to the problem, whereas an LLM is new to it every new prompt. That is true for the LLMs you have access to now. Now imagine if the LLM had been trained on your entire code base. And not just the code, but the entire commit history, commit messages and also all of your external design docs. And code and docs from all relevant projects. That LLM would not be new to the problem every prompt. Basically, imagine that you fine-tuned an LLM for your specific project. You will eventually have access to such an LLM. | | |
| ▲ | snowfield 2 days ago | parent | next [-] | | AI training doesn't work like that. You don't train it on context; you train it on recognition and patterns. | |
| ▲ | scott_s 2 days ago | parent [-] | | You train on data. Context is also data. If you want a model to have certain data, you can bake it into the model during training, or provide it as context during inference. But if the "context" you want the model to have is big enough, you're going to want to train (or fine-tune) on it. Consider that you're coding a Linux device driver. If you ask for help from an LLM that has never seen the Linux kernel code, has never seen a Linux device driver and has never seen all of the documentation from the Linux kernel, you're going to need to provide all of this as context. And that's both going to be onerous on you, and it might not be feasible. But if the LLM has already seen all of that during training, you don't need to provide it as context. Your context may be as simple as "I am coding a Linux device driver" and show it some of your code. |
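(To make the "bake it into the model during training" option concrete, here is a minimal, hypothetical sketch of fine-tuning a small causal LM on a project's source files using Hugging Face `transformers` and `datasets`. The checkpoint, project path, and hyperparameters are placeholders, not anything described in the thread.)

```python
# Hypothetical sketch: fine-tune a causal LM on a project's files so that
# knowledge is baked into the weights instead of supplied as context each prompt.
from pathlib import Path
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # placeholder; any causal LM checkpoint works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Treat the would-be "context" as training data; docs and commit messages
# could be gathered the same way.
texts = [p.read_text(errors="ignore") for p in Path("my_project").rglob("*.py")]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="project-tuned", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # afterwards, prompts no longer need the raw files as context
```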
| |
| ▲ | jimbokun 3 days ago | parent | prev [-] | | Why haven’t the big AI companies been pursuing that approach, vs just ramping up context window size? | |
| ▲ | menaerus 2 days ago | parent | next [-] | | Well, we don't really know if they aren't doing exactly that for their internal code repos, right? Conceptually, there is no difference between fine-tuning an LLM to be a law expert for a specific country and fine-tuning an LLM to be an expert on a given codebase. The former is already happening and is public. The latter is not yet public, but I believe it is happening. The reason the big companies are pursuing generic LLMs is that they serve as a foundation for basically any other derivative, domain-specific work. | |
| ▲ | scott_s 2 days ago | parent | prev [-] | | Because training one family of models with very large context windows can be offered to the entire world as an online service. That is a very different business model from training or fine-tuning individual models specifically for individual customers. Someone will figure out how to do that at scale, eventually. It might require the cost of training to reduce significantly. But large companies with the resources to do this for themselves will do it, and many are doing it. |
|
|
|
|
|
| ▲ | ehnto 3 days ago | parent | prev | next [-] |
| Additionally, the more information you put into the context, the more confused the LLM will get. If you did dump the whole codebase into the context, it would not suddenly understand the whole thing. It is still an LLM; all you have done is pollute the context with a million lines of unrelated code, and some lines of related code, which it will struggle to find in the noise (in my experience of much smaller experiments). |
| |
| ▲ | Bombthecat 3 days ago | parent [-] | | I call this context decay. :) The bigger the context, the more stuff "decays", sometimes into completely different meanings. |
|
|
| ▲ | PaulDavisThe1st 3 days ago | parent | prev | next [-] |
| > I disagree. I may not have the whole codebase in my head in one moment, but I have had all of it in my head at some point, and it is still there; that is not true of an LLM. All three points (you have had all of it in your head at some point, it is still there, that is not true of an LLM) are mere conjectures, not provable at this time, certainly not in the general case. You may be able to show this for some codebases, some developers, and some LLMs, but not all. |
| |
| ▲ | fnordsensei 3 days ago | parent | next [-] | | The brain can literally not process any piece of information without being changed by the act of processing it. Neuronal pathways are constantly being reinforced or weakened. Even remembering alters the memory being recalled, entirely unlike how computers work. | | |
| ▲ | Lutger 3 days ago | parent | next [-] | | I've always found it interesting that once I take a wrong turn finding my way through the city, and I'm not deliberate about remembering that this was, in fact, a mistake, I am more prone to taking the same wrong turn again the next time. | |
| ▲ | dberge 3 days ago | parent [-] | | > once I take a wrong turn finding my way through the city... I am more prone to taking the same wrong turn again You may want to stay home then to avoid getting lost. |
| |
| ▲ | johnisgood 3 days ago | parent | prev [-] | | For humans, remembering strengthens that memory, even if it is dead wrong. |
| |
| ▲ | jbs789 3 days ago | parent | prev | next [-] | | I'm not sure the idea that a developer maintains a high-level understanding is all that controversial... | |
| ▲ | animuchan 3 days ago | parent [-] | | The trend for this idea's controversiality is shown on this very small chart: / |
| |
| ▲ | ehnto 3 days ago | parent | prev [-] | | I never intended to say it was true of all codebases for all developers; that would make no sense. I don't know all developers. I think it's objectively true that the information is not in the LLM. It was not trained on all codebases, and LLMs do not (immediately) retrain on the codebases they encounter through usage. |
|
|
| ▲ | xwolfi 3 days ago | parent | prev | next [-] |
| You have only worked on very small codebases then. When you work on giant ones, you Ctrl+F a lot, build a limited model of the problem space, and pray the unit tests will catch anything you might have missed... |
| |
| ▲ | akhosravian 3 days ago | parent | next [-] | | And when you work on a really big codebase you start having multiple files and have to learn tools more advanced than ctrl-f!! | | |
| ▲ | ghurtado 3 days ago | parent [-] | | > and have to learn tools more advanced than ctrl-f!! Such as ctrl-shift-f. But this is an advanced topic, I don't wanna get into it. |
| |
| ▲ | ehnto 3 days ago | parent | prev | next [-] | | We're measuring lengths of string, but I would not say I have worked on small projects. I am very familiar with discovery, and have worked just fine on a lot of large legacy projects that have no tests. |
| ▲ | jimbokun 3 days ago | parent | prev [-] | | Why are LLMs so bad at doing the same thing? |
|
|
| ▲ | airbreather 3 days ago | parent | prev | next [-] |
| You will have abstractions - black boxing, interface overviews, etc. Humans can only hold so much detail in current context memory; some say 7 items on average. |
| |
| ▲ | ehnto 3 days ago | parent | next [-] | | Of course, but even those black boxes are not empty; they've got a vague picture inside them based on prior experience. I have been doing this for a while, so most things are just various flavours of the same stuff, especially in enterprise software. The important thing in this context is that I know it's all there, I don't have to grep the codebase to fill up my context, and my understanding of the holistic project does not change each time I am booted up. | |
| ▲ | jimbokun 3 days ago | parent | prev [-] | | And LLMs can’t leverage these abstractions nearly as well as humans…so far. |
|
|
| ▲ | ivape 3 days ago | parent | prev [-] |
| > My ability to break a problem down does not start from listing the files out and reading a few. It does, it’s just happening at lightning speed. |
| |
| ▲ | CPLX 3 days ago | parent [-] | | We don't actually know that. If we had that level of understanding of how exactly our brains do what they do, things would be quite different. | |
|