jdlshore 5 hours ago

Carson’s experience matches mine: AI is good at analysis and boilerplate, but not good at the kind of critical thinking necessary for good designs. If it were human, I would say that it jumps to solutions too quickly, rather than stepping back to consider the big picture and how everything should fit together to make a cohesive whole.

It’s not human, of course, and I think this problem actually relates to the fact that LLMs don’t have a world model. They don’t study and think through a design in the way that humans do. They don’t form a mental model of how everything fits together and how that design can be tweaked to most elegantly support a change.

I suspect that this is a fundamental limitation of LLMs, and that design will remain a weak point until some sort of bespoke design AI is bolted onto the side. In the meantime, we’ve got a lot of people producing a lot of code very quickly, and I think the debt in that code is going to be a millstone around our necks for a long time to come.

recroad 4 minutes ago

Have to disagree with this, as it's excellent at helping you go wide and broad before converging. I suggest trying OpenSpec and using /ospx:explore to state your problem and go from there.

rst 2 hours ago

One partial mitigation is to ask it to use plan mode -- and then very carefully review the plan before allowing it to execute.

bob1029 35 minutes ago

I've been in a lot of situations where I could step gpt5.x through a big refactor if I spoon-fed it one type name at a time. If I let it try to do the whole thing at once, it will refuse or get stuck in apply-patch loops.

Planner / executor separation can make a huge difference in performance. LLMs are fantastic at coming up with a lot of elaborate narratives regarding what should be done. They are terrible at doing that prescribed work all at once. This impedance mismatch is best resolved with a simple role separation. Placing a shared collection of tasks between these roles is how you can decouple them. The executors need significantly more tokens than your planners to get the job done. It's probably in the range of 10-100x more for really complicated jobs with a lot of iterations through compiler feedback, SQL provider errors, etc. This is why you can't do both things in the same context very well.
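
To make the shape of that concrete, here is a minimal sketch of a planner and an executor decoupled by a shared task queue. Everything in it is hypothetical: `plan`, `execute`, `run`, and the `check` callback are stand-ins for model calls and compiler or SQL feedback, not any real tool's API. The only point is that each task gets its own fresh context, and only the executor loops on feedback.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    description: str
    done: bool = False
    attempts: int = 0
    log: list = field(default_factory=list)


def plan(goal: str, type_names: list) -> deque:
    """Stand-in planner (hypothetical): one small task per type name."""
    return deque(Task(f"{goal}: update {name}") for name in type_names)


def execute(task: Task, check, max_attempts: int = 5) -> Task:
    """Stand-in executor (hypothetical): retries one task against feedback.

    `check` plays the role of the compiler or SQL provider: it returns an
    error string, or None when the task's work is accepted.
    """
    # Fresh context per task: nothing from the planner or other tasks leaks in.
    context = [task.description]
    while not task.done and task.attempts < max_attempts:
        task.attempts += 1
        error = check(task)
        if error is None:
            task.done = True
        else:
            context.append(error)  # feedback accumulates only in this context
    task.log = context
    return task


def run(goal: str, type_names: list, check) -> list:
    """Drain the shared queue, handing each task to an executor in turn."""
    queue = plan(goal, type_names)
    finished = []
    while queue:
        finished.append(execute(queue.popleft(), check))
    return finished
```

In this sketch the planner's output is tiny (a list of descriptions), while each executor's context grows with every failed check, which is one way to picture why the token budgets differ so much.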

saagarjha 2 hours ago

At that point I would rather just write the plan myself

oulipo2 3 hours ago

Exactly, an LLM is good at "code inpainting": define clear structures and goals, and it will fill in the boilerplate. But it doesn't work for reasoning and abstraction, so it fails to synthesise and propose novel views. But that's integral to the way it's designed and has been trained, to do a kind of "averaging" which limits its capacity to explore novel designs.

thunky an hour ago

> But it doesn't work for reasoning and abstraction, so it fails to synthesise and propose novel views

I disagree. Have a conversation with it about your problem and work through design decisions with it. When I do that, I find it gives me a lot of good ideas.

Disclaimer: I'm not working on anything groundbreaking (like most people)

vb-8448 3 hours ago

It's just because not enough people had this very specific problem before.

This article will be part of the next model's training set, and the model will probably be able to solve the problem despite not understanding anything about the world and not studying or thinking.