wldlyinaccurate 5 days ago

I think the parent comment means "commits" in the sense of the actual changeset, not just the message.

anshumankmr 5 days ago | parent [-]

That is also problematic, because a git diff will probably require an exponential gain in context length AND the ability for the LLM to use said context effectively.

That being said, the context length problem could potentially be solved, but it will take a bit of time. I think Llama 4 had a 10M context length (not sure if anyone has tried prompting it with that much data to see how effective it really is).

tayo42 5 days ago | parent [-]

Do all of the diffs need to be included? Can't you include like a summarized version of a few changes?

Like, I don't memorize the last 20 commits, but I generally know the direction things are going from having read those commits at some point.
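A rough sketch of what such a summarized version might look like, assuming Python and a local git checkout. The names `git_log_numstat` and `summarize` are made up for illustration, and the `@@` prefix is just an arbitrary marker for spotting commit header lines in the output of `git log --numstat`:

```python
import subprocess


def git_log_numstat(n=20):
    """Return raw `git log --numstat` text for the last n commits."""
    return subprocess.run(
        # "@@" is an arbitrary marker (an assumption of this sketch) so that
        # commit header lines are easy to tell apart from per-file stat lines.
        ["git", "log", "-n", str(n), "--numstat", "--format=@@%h %s"],
        capture_output=True, text=True, check=True,
    ).stdout


def summarize(log_text, max_files=3):
    """Condense numstat output to one short line per commit."""
    commits = []
    for line in log_text.splitlines():
        if line.startswith("@@"):
            sha, _, subject = line[2:].partition(" ")
            commits.append((sha, subject, []))
        elif commits and line.count("\t") >= 2:
            added, deleted, path = line.split("\t", 2)
            # Binary files report "-" instead of a number; count them as 0.
            churn = sum(int(x) for x in (added, deleted) if x.isdigit())
            commits[-1][2].append((churn, path))

    summary = []
    for sha, subject, files in commits:
        # Keep only the most heavily changed files to save context.
        top = sorted(files, key=lambda f: f[0], reverse=True)[:max_files]
        touched = ", ".join(f"{path}: {churn} lines" for churn, path in top)
        summary.append(f"{sha} {subject} [{touched}]")
    return summary
```

Something like `summarize(git_log_numstat(20))` gives one line per commit, which is far smaller than the full diffs, though it obviously drops what actually changed inside each file.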

anshumankmr 5 days ago | parent [-]

If the relevant commit was made a year or so back, then the last 20 commits would probably prove insufficient. Say a team member is supposed to use some existing helper method already present in the codebase: it is easy to tell a person to use it, whereas an LLM may just write another function to perform that same operation, which is inefficient.

And even if you juiced up the context length of an LLM to astronomical numbers AND somehow made it better at parsing and understanding its context, it will not always repeat said capabilities in other codebases (see for example o3 supposedly being at the top of most benchmarks, yet it will still fumble a simple variation of the mother-is-a-surgeon puzzle).

I am not saying it's impossible for a company to figure this out, but it will be incredibly hard.