Incipient 2 months ago

Was more of a general comment - I'm surprised there is significant variation between any of the frontier models?

However: VS Code with various Python frameworks/libraries (Dash, FastAPI, pandas, etc.), typically passing the 4-5 relevant files in as context.

Developing via Docker, so I haven't found a nice way for agents to work.

fragmede 2 months ago | parent | next [-]

> I'm surprised there is significant variation between any of the frontier models?

This comment of mine is a bit dated, but even the same model can have significant variation if you change the prompt by just a few words.

https://news.ycombinator.com/item?id=42506554

danielbln 2 months ago | parent | prev [-]

I would suggest using an agentic system like Cline, so that the LLM can wander through the codebase by itself, do research, and build a "mental model", then set up an implementation plan. Then you iterate on that and hand it off for implementation. This flow works significantly better than what you're describing.

otabdeveloper4 2 months ago | parent [-]

> LLM can wander through the codebase by itself and do research and build a "mental model"

It can't really do that due to context length limitations.

exe34 2 months ago | parent | next [-]

It doesn't need the entire codebase, it just needs the call map, the function signatures, etc. It doesn't have to include everything in a call - but having access to all of it means it can pick what seems relevant.

danielbln 2 months ago | parent | next [-]

Yes, that's exactly right. The LLM gets a rough overview over the project (as you said, including function signatures and such) and will then decide what to open and use to complete/implement the objective.
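As a rough illustration of the kind of overview being described (this is a hypothetical sketch, not how Cline or any particular agent actually builds its map), Python's `ast` module can pull out function and class signatures without loading full file contents into context:

```python
import ast

def repo_map(source: str, filename: str = "<mem>") -> list[str]:
    """Extract function/class signatures from one file as a compact repo map.

    Returns short signature strings instead of full source, so an LLM can
    decide which files to open without reading everything."""
    tree = ast.parse(source, filename)
    sigs = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            sigs.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            sigs.append(f"class {node.name}")
    return sigs
```

Running this over every file and concatenating the results gives the "call map and function signatures" view, at a small fraction of the token cost of the raw code.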

otabdeveloper4 2 months ago | parent | prev [-]

In a real project the call map and function signatures are millions of tokens themselves.

exe34 2 months ago | parent [-]

For sufficiently large values of real.

otabdeveloper4 2 months ago | parent [-]

Anything less is not a "project", it's a "file".

exe34 2 months ago | parent [-]

That's right, there is no true Scotsman!

otabdeveloper4 2 months ago | parent [-]

Incorrect attempt at fallacy baiting.

If your repo map fits into 1000 tokens then your repo is small enough that you can just concatenate all the files together and feed the result as one prompt to the LLM.

No, current LLM technology does not allow processing actual (i.e. large) repos.
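For a small repo, the "just concatenate everything" approach really is a one-liner with a budget check. A minimal sketch (the ~4 characters/token estimate is a common rule of thumb, not an exact tokenizer count):

```python
from pathlib import Path

def concat_repo(root: str, exts=(".py",), token_budget: int = 100_000) -> str:
    """Concatenate source files into one prompt string, stopping once a rough
    token budget would be exceeded (estimated at ~4 chars per token)."""
    parts, chars = [], 0
    for path in sorted(Path(root).rglob("*")):
        if path.suffix not in exts or not path.is_file():
            continue
        text = path.read_text(errors="ignore")
        if (chars + len(text)) // 4 > token_budget:
            break
        parts.append(f"# --- {path} ---\n{text}")
        chars += len(text)
    return "\n\n".join(parts)
```

If the loop hits the `break` before finishing, that is exactly the point being argued here: the repo no longer fits in one prompt and a different strategy is needed.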

simonw 2 months ago | parent [-]

Where's your cutoff for "large"?

johnisgood 2 months ago | parent | prev | next [-]

1k LOC is perfectly fine, I did not experience issues with Claude with most (not all) projects around ~1k LOC.

otabdeveloper4 2 months ago | parent [-]

Actual projects where you'd want some LLM help start with millions of lines of code, not thousands.

With 1k lines of code you don't need an LLM, the entire source code can fit in one intern's head.

johnisgood 2 months ago | parent | next [-]

The OP mentioned having LLM issues with 1k LOC, so I suppose he would have problems with millions. :D

simonw 2 months ago | parent | prev [-]

Have you tried Claude Code yet?

Even with its 200,000 token limit, it's still really impressive at diving through large codebases using find and grep.

lukan 2 months ago | parent | prev [-]

I guess people are talking about different kinds of projects here in terms of project size.