generalizations 7 days ago

These LLM discussions really need everyone to mention what LLM they're actually using.

> AI is awesome for coding! [Opus 4]

> No AI sucks for coding and it messed everything up! [4o]

Would really clear the air. People seem to be evaluating the dumbest models (apparently because they don't know any better?) and then deciding the whole AI thing just doesn't work.

stackbutterflow 6 days ago | parent | next [-]

Don't expect any improvement ever.

It happens on many topics related to software engineering.

The web developer is replying to the embedded developer who is replying to the architect-that-doesn't-code who is replying to someone with 2 years of experience who is replying to someone working at Google who is replying to someone working at a midsize B2B German company with 4 customers. And on and on.

Context is always omitted and we're all talking about different things, ignoring the day-to-day reality of our interlocutors.

bagacrap 6 days ago | parent | prev | next [-]

My experience is that AI enthusiasts will always say, "well you just used the wrong model". And when no existing model works well, they say, "well in 6 months it will work". The utility of agentic coding for complex projects is apparently unfalsifiable.

troupo 7 days ago | parent | prev | next [-]

> These LLM discussions really need everyone to mention what LLM they're actually using.

They need to mention significantly more than that: https://dmitriid.com/everything-around-llms-is-still-magical...

--- start quote ---

Do we know which projects people work on? No

Do we know which codebases (greenfield, mature, proprietary etc.) people work on? No

Do we know the level of expertise the people have? No.

Is the expertise in the same domain, codebase, language that they apply LLMs to? We don't know.

How much additional work did they have reviewing, fixing, deploying, finishing etc.? We don't know.

--- end quote ---

And that's just the tip of the iceberg. And that is an iceberg before we hit another one: that we're trying to blindly reverse engineer a non-deterministic blackbox inside a provider's blackbox.

taormina 7 days ago | parent | prev | next [-]

I've used a wide variety of the "best" models, and I've mostly settled on Opus 4 and Sonnet 4 with Claude Code, but they don't ever actually get better. Grok 3-4 and GPT4 were worse, but like, at a certain point you don't get brownie points for not tripping over how low the bar is set.

generalizations 6 days ago | parent [-]

People have actually been basing their assertions on 4o. The bar is really low and people are still completely missing it.

omnicognate 7 days ago | parent | prev | next [-]

What the article says is as true of Opus 4 as any other LLM.

energy123 6 days ago | parent | prev [-]

> AI is exceptional for coding! [high-compute scaffold around multiple instances / undisclosed IOI model / AlphaEvolve]

> AI is awesome for coding! [Gpt-5 Pro]

> AI is somewhat awesome for coding! ["gpt-5" with verbosity "high" and effort "high"]

> AI is pretty good at coding! [ChatGPT 5 Thinking through a Pro subscription with Juice of 128]

> AI is mediocre at coding! [ChatGPT 5 Thinking through a Plus subscription with a Juice of 64]

> AI sucks at coding! [ChatGPT 5 auto routing]