kvnhn 2 days ago

IMO, a key passage that's buried:

"You can ask the agent for advice on ways to improve your application, but be really careful; it loves to “improve” things, and is quick to suggest adding abstraction layers, etc. Every single idea it gives you will seem valid, and most of them will seem like things that you should really consider doing. RESIST THE URGE..."

A thousand times this. LLMs love to over-engineer things. I often wonder how much of this is attributable to the training data...

brookst 2 days ago | parent | next [-]

They’re not dissimilar to human devs, who also often feel the need to re-platform, refactor, over-generalize, etc.

The key thing in both cases, human and AI, is to be super clear about goals. Don’t say “how can this be improved”, say “what can we do to improve maintainability without major architectural changes” or “what changes would be required to scale to 100x volume” or whatever.

Open-ended, poorly-defined asks are bad news in any planning/execution based project.

strls 2 days ago | parent | next [-]

A senior programmer does not suggest adding more complexity/abstraction layers just to say something. An LLM absolutely does, every single time in my experience.

awesome_dude 2 days ago | parent [-]

You might not, but every "senior" programmer I have met on my journey has given bad answers just like the LLMs do - and because of them I have a built-in verifier: I check whatever is being proposed, whether it comes from "seniors" or from LLMs.

exitb 2 days ago | parent | prev | next [-]

There are however human developers that have built enough general and project-specific expertise to be able to answer these open-ended, poorly-defined requests. In fact, given how often that happens, maybe that’s at the core of what we’re being paid for.

brookst a day ago | parent | next [-]

But if the business doesn’t know the goals, is it really adding any value to go fulfill poorly defined requests like “make it better”?

AI tools can also take a swing at that kind of thing. But without a product/business intent it’s just shooting in the dark, whether human or AI.

awesome_dude 2 days ago | parent | prev [-]

I have to be honest, I've heard of these famed "10x" developers, but when I come close to one I only ever find "hacks" with a brittle understanding of a single architecture.

awesome_dude 2 days ago | parent | prev [-]

Most definitely. Asking the LLM those things is much the same as asking people on Reddit, Stack Overflow, IRC, or even Hacker News.

iguessthislldo 2 days ago | parent | prev | next [-]

This is something I experienced first hand a few weeks ago when I first used Claude. I have a recursive-descent parser library I haven't touched in a few years that I want to keep developing but always procrastinate on. It has always been kind of slow, so I wanted to see if Claude could improve the speed. It made very reasonable suggestions, the main one being to cache parsing rules keyed by the leading token kind. The code it produced looked fine and didn't break tests, but when I did a simple timed-loop performance comparison, Claude's changes were slightly slower. Digging through the code, I discovered I was already caching rules in a similar way and had forgotten about it, so the slight performance loss came from doing the work twice.
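For the curious, the change had roughly the shape of the sketch below (a Python sketch with made-up rule and token names, not code from my library): index the rules by the token kind that can start them, so the parser only tries plausible candidates instead of every rule.

    from collections import defaultdict

    class Rule:
        def __init__(self, name, first_kinds, parse_fn):
            self.name = name
            self.first_kinds = first_kinds  # token kinds this rule can begin with
            self.parse_fn = parse_fn        # callable(tokens, pos) -> result or None

    class Parser:
        def __init__(self, rules):
            # The "cache": rules indexed by leading token kind, built once up front.
            self.rules_by_kind = defaultdict(list)
            for rule in rules:
                for kind in rule.first_kinds:
                    self.rules_by_kind[kind].append(rule)

        def parse(self, tokens, pos=0):
            kind = tokens[pos][0]
            # Only try rules that can actually start with this token kind.
            for rule in self.rules_by_kind.get(kind, ()):
                result = rule.parse_fn(tokens, pos)
                if result is not None:
                    return result
            return None

    # Trivial usage: one rule keyed on a NUMBER token kind.
    number_rule = Rule("number", {"NUMBER"}, lambda toks, i: ("number", toks[i][1]))
    parser = Parser([number_rule])
    print(parser.parse([("NUMBER", "42")]))  # ('number', '42')

The timing comparison itself was just a loop over a fixed input, before and after the change.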

nosianu 2 days ago | parent [-]

Caching sounds fine, and it is a very potent technique. Nevertheless, I avoid it until I have almost no other options left, and no good ones. You now have to manage that cache, introduce the potential for rare, hard-to-debug runtime timing errors, and add a lot of complexity. For me, caching should come at the end, when the whole project is finished, you've exhausted all your architecture options, and you still need more speed. And I'll add some big warnings, and pray I don't run into too many new issues introduced by the caching.

Caching is better suited to things that are well isolated and definitely stay "inside the box", with no apparent way for the effects to leak outside the module. But you never know when you've overlooked something, or when some later refactoring quietly invalidates the originally sane and clean assumptions, because whoever does the refactoring only looks at a sub-section of the code. So it is not just a question of getting it right for the current system, but of anticipating that anything that can go wrong might actually go wrong, if I leave enough opportunities (complexity) around, even in modules that are well encapsulated right now.
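A minimal sketch of that failure mode (Python, entirely hypothetical names): the cache assumes some input never changes after construction, a later refactoring starts mutating it, and the cache silently serves stale results.

    class PriceService:
        def __init__(self, config):
            self.config = config
            self._cache = {}  # assumes config never changes after construction

        def price(self, item):
            if item not in self._cache:
                self._cache[item] = self.config["base"] * self.config["markup"][item]
            return self._cache[item]

    svc = PriceService({"base": 100, "markup": {"widget": 1.2}})
    print(svc.price("widget"))            # 120.0, computed and cached
    svc.config["markup"]["widget"] = 2.0  # a later refactoring starts mutating config...
    print(svc.price("widget"))            # still 120.0: silently stale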

I mean, it's like having more than one database, having to use both, and keeping them in sync. Who does that voluntarily? There's already caching at many of the lower levels, from SSDs and CPUs to the OS, and it's complex enough already and can lead to unexpected behavior. Adding even more of it in the app itself does not appeal to me, if I can help it. I'm just way too stupid for all this complexity; I need it nice and simple. Well, as nice and simple as it gets these days, given that larger IT systems seem to be approaching biological levels of complexity.

If you are not writing the end system but a library, there is also the possibility that the actual system will do its own caching at a higher level. I would carefully evaluate whether there is really a need for any caching inside my library; depending on how it is used, caching at the higher level would likely make it obsolete, because the library functions would not be called as often as predicted in the first place.

There is also the fact that you need a very different focus and mindset for the caching code than for the code doing the actual work. For caching, you look at very different things than you do for the algorithm: for the app you think at a higher level, about how to get the work done, while for caching you go down into the oily, dirty gearboxes of the machine and check all the shafts and gears and connections. Ideally caching would not be part of the business code at all, but that is hard to avoid, and the result is messy: very different kinds of code, dealing with very different problems, sitting close together or even intertwined.
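One way to keep the two concerns at least syntactically apart (a small sketch using Python's standard library, assuming the function is pure and its arguments hashable): keep the business logic as a plain, cache-free function and bolt the caching on from outside, so it can be removed or tuned without touching the logic.

    from functools import lru_cache

    # Business logic: plain, cache-free, easy to reason about.
    def classify(token_text):
        return "number" if token_text.isdigit() else "word"

    # Caching bolted on separately; only safe because classify() is pure
    # and its argument is hashable - exactly the assumptions that a later
    # refactoring can quietly break.
    classify_cached = lru_cache(maxsize=4096)(classify)

    print(classify_cached("42"))  # "number", computed
    print(classify_cached("42"))  # "number", served from the cache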

nativeit 2 days ago | parent | prev [-]

> I often wonder how much of this is attributable to the training data...

I'd reckon anywhere between 99.9% and 100%. Give or take.