spicyusername 2 days ago

I guess we're just going to be in the age of this conversation topic until everyone gets tired of talking about it.

Every one of these discussions boils down to the following:

- LLMs are not good at writing code on their own unless it's extremely simple or boilerplate

- LLMs can be good at helping you debug existing code

- LLMs can be good at brainstorming solutions to new problems

- The code that is written by LLMs always needs to be heavily monitored for correctness, style, and design, and then typically edited down, often by half or more

- LLMs' utility is high enough that they are now going to be a standard tool in the toolbox of every software engineer, but they are definitely not replacing anyone at current capability.

- New software engineers are going to suffer the most because they know how to edit the responses the least, but this was true when they wrote their own code with stack overflow.

- At senior level, sometimes using LLMs is going to save you a ton of time and sometimes it's going to waste your time. Net-net, it's probably positive, but there are definitely some horrible days where you spend too long going back and forth, when you should have just tried to solve the problem yourself.

rafaelmn 2 days ago | parent | next [-]

> but this was true when they wrote their own code with stack overflow.

Searching for solutions and integrating the examples you find requires effort that develops into a skill. You would rarely get solutions from SO that would just drop into the codebase. If I give a task to you and you produce a correct solution on the initial review, I now know I can trust you to deal with this kind of problem in the future. Especially after a few reviews.

If you just vibed through the problem, the LLM might have given you the correct solution - but there is no guarantee that it will do it again in the future. Because you spent less effort on search/official docs/integration into the codebase, you learned less about everything surrounding it.

So using LLMs as a junior you are just breaking my trust, and we both know you are not a competent reviewer of LLM code - why am I even dealing with you when I'll get LLM outputs faster myself? This was my experience so far.

OvbiousError 2 days ago | parent | next [-]

> So using LLMs as a junior you are just breaking my trust, and we both know you are not a competent reviewer of LLM code - why am I even dealing with you when I'll get LLM outputs faster myself? This was my experience so far.

So much this. I see a 1,000-line, super-complicated PR that was whipped up in less than a day and I know they didn't read all of it, let alone understand it.

theshrike79 a day ago | parent [-]

This is exactly what code reviews are for; doing reviews with juniors is always time-consuming.

rafaelmn 16 hours ago | parent [-]

Yes, but you're supposed to reap the benefit of that junior learning from the reviews. If there is no progress and no trust acquired in their ability, they are just a burden.

fhd2 2 days ago | parent | prev | next [-]

Like with any kind of learning, without a feedback loop (as tight as possible IMHO), it's not gonna happen. And there is always some kind of feedback loop.

Ultra short cycle: Pairing with a senior, solid manual and automated testing during development.

Reasonably short cycle: Code review by a senior within hours, ideally for small subsets of the work; QA testing by a separate person within hours.

Borderline too long cycle: Code review of larger chunks of code by a senior with days of delay, QA testing by a separate person days or weeks after implementation.

Terminally long feedback cycle: Critical bug in production, data loss, negative career consequences.

I'm confident that juniors will still learn, eventually. Seniors can help them learn a whole lot faster though, if both sides want that, and if the organisation lets them. And yeah, that's even more the case than in the pre-LLM world.

DenisM 2 days ago | parent [-]

LLMs can also help with learning if you ask them what could be done better. Seniors can write a pre-prompt so that company conventions are taken into account.

dchftcs a day ago | parent | prev | next [-]

People who are obviously churning out LLM code uncritically should be investigated and, depending on the findings, made redundant. It's a good thing that this allows teams to filter out these people earlier. In my career I have found that a big predictor of the quality of code is the amount of thought put into it and the brains behind it - procedures like code review can only catch so many things, and a senior's time is saved only when a junior has actually put their brain to work. If someone is shown to never put thought into their work, they need to make way for people who actually do.

godelski a day ago | parent | prev [-]

  > you learned less about everything surrounding it.

I think one of the big acceleration points in my skills as a developer was when I moved from searching SO and other similar sources to reading the docs and reading the code. At first, this was much slower. I was usually looking for a specific thing and didn't need the surrounding context. But as I continued, that surrounding context became important. The stuff I was reading compounded and helped me see much more. These gains were completely invisible and sometimes even looked like losses. In reality, that context was always important; I just wasn't skilled enough to understand why. Those "losses" are more like the loss you take when you make an investment: you part with money, but gain a stock.

I mean, I still use SO, Medium articles, LLMs, and tons of sources. But I find myself turning to the docs as my first choice now. At worst, I come away with better questions to take to the other sources.

I think there's a terrible belief that has developed in CS, and the LLM crowd targets it: the idea that everything is simple. There's truth to this, but there's a lot of complexity to simplicity. The defining characteristic between an expert and a novice is their knowledge of nuance. The expert knows which nuances matter and which don't. Sometimes a small issue compounds and turns into a large one, sometimes it disappears. The junior can't tell the difference, but the expert can. Unfortunately, this can sound like bikeshedding and quibbling over nothings (sometimes it is). But only experts can tell the difference ¯\_(ツ)_/¯

samurai_sword 15 hours ago | parent [-]

You are absolutely right. I work as a robotics engineer at a company building autonomous systems. I use Cursor and am currently using gpt-5-high for coding. When I started coding on my project 3 years ago there was no AI coding. I had to learn how to code by reading lots of docs and lots and lots of code (the nav2 stack). This gave me a sense of how code is architected, why it is the way it is, etc. I also try not to blindly follow any code I see; for every single piece of code I critically ask lots of questions (this made me crazy, the good kind). This helped me learn extremely fast. So the point is: everyone must know when their brain is being used and when it is not. If your brain is not being used at any point in a project, then you are probably out of the loop.

The thing about AI is that when coding models started out, they were kinda bad. But I feel any tool that saves time or effort is a useful tool. I use AI now mostly to add some methods, ask questions about the code base, and brainstorm ideas against that code base. There are levels to how you use this tool (AI).

1. Complete trust (if it's an easy task and you can verify quickly).

2. Medium trust (you ask questions back to the AI to critically understand why it did what it did).

3. Zero trust (this is very important for learning fast, not coding: you need to press the AI to give you lots of information, right or wrong, cross-check it manually, and soak it into your brain carefully. Here you will find out whether that AI is good or bad).

Conclusion: we are human beings. Any tool must be used with caution, especially AI, which is capable of playing tricks on your precious brain. Build razor-sharp instincts and trust ONLY them.

chamomeal 2 days ago | parent | prev | next [-]

Yeah every time I see one of these articles posted on HN I know I'll see a bunch of comments like "well here's how I use claude code: I keep it on a tight leash and have short feedback loops, so that I'm still the driver, and have markdown files that explain the style I'm going for...". Which is fine lol but I'm tired of seeing the exact same conversations.

It's exhausting to hear about AI all the time but it's fun to watch history happen. In a decade we'll look back at all these convos and remember how wild of a time it was to be a programmer.

coldpie 2 days ago | parent | next [-]

I'm thiiiis close to writing a Firefox extension that auto-hides any HN headline with an LLM/AI-related keyword in the title, just so I can find something interesting on here again.

codyb 2 days ago | parent | next [-]

It comes and goes in cycles... I remember the heyday of MVC frameworks and "oh my, this one is MVVC!" ad nauseam for... years lol.

I stopped coming here for a year or two, now I visit once a day or so and mostly just skim a couple threads.

Eventually, this entire field... just starts to feel pretty cyclical.

oblio a day ago | parent [-]

Keep in mind that the cycles here are not circles, they're spirals. We <<are>> progressing, it's just slow enough to be hard to notice.

A bit of proof: https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...

1. Do you use source control? - I haven't seen a software company without source control in... 2 decades?

2. Can you make a build in one step? - This is a bit tricky, but it's super widespread. Maybe not universal, but very widespread.

3. Do you make daily builds? - Same as #2.

4. Do you have a bug database? - Same as #2.

5. Do you fix bugs before writing new code? - This is a debatable topic but you could argue that modern bugs are more complex and we are fixing them.

6. Do you have an up-to-date schedule? - Heh, some things you just can't win :-p

7. Do you have a spec? - Similar to #6.

8. Do programmers have quiet working conditions? - This one is the biggest modern failure, but it's not one of tech.

9. Do you use the best tools money can buy? - Similar to #2.

10. Do you have testers? - We've moved to automated testing. We've lost some flair but we've gained a lot in day-to-day quality.

11. Do new candidates write code during their interview? - Really widespread now, but not universal. Less widespread than having proper build systems.

12. Do you do hallway usability testing? - Varies a lot by field, it used to vary even back in the day.

dysoco 2 days ago | parent | prev | next [-]

I appreciate HN staying simple but a tag system like lobsters has would be pretty nice...

icdtea 2 days ago | parent | prev | next [-]

You can do this with a custom filter list in uBlock Origin, no custom extension necessary.

coldpie 2 days ago | parent [-]

I'm thinking something that would actually use HN's "Hide" feature, so other stories will populate the page after the AI ones are hidden. Is that something uBO could do?
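
Roughly what I have in mind is a userscript along these lines - a sketch only: the selectors and the hide-link handling are assumptions about HN's current markup, and it only does anything while you're logged in, since that's when the "hide" links exist.

  // ==UserScript==
  // @name   HN auto-hide AI stories (hypothetical)
  // @match  https://news.ycombinator.com/*
  // ==/UserScript==
  (function () {
    // Example keyword list; tune to taste.
    const KEYWORDS = /\b(AI|LLM|GPT|Claude|OpenAI|Gemini|Copilot)\b/i;

    document.querySelectorAll("tr.athing").forEach((row) => {
      const title = row.querySelector(".titleline a");
      if (!title || !KEYWORDS.test(title.textContent || "")) return;

      // The subtext row (points / comments / hide) follows the title row.
      const subtext = row.nextElementSibling;
      const hideLink = subtext && subtext.querySelector('a[href^="hide?"]');
      if (!hideLink) return;

      // Hit HN's own hide endpoint in the background instead of navigating,
      // then drop the rows locally; replacement stories show up on the next load.
      fetch(hideLink.href, { credentials: "include" });
      if (subtext.nextElementSibling) subtext.nextElementSibling.remove(); // spacer row
      subtext.remove();
      row.remove();
    });
  })();

Since it goes through the real Hide feature, the stories should stay hidden server-side too, not just cosmetically filtered.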

icdtea 2 days ago | parent [-]

You make a good point, I don't believe so. I currently block most LLM/AI threads and some pages can get quite sparse. Would love to check that out if you get around to putting that together!

ftkftk 2 days ago | parent | prev [-]

You could use an agentic AI coding tool to vibe code you one in minutes! /s

coldpie 2 days ago | parent [-]

Honestly might give that a go, yeah. Brand new, low stakes, throwaway projects are one of the few things these tools are actually genuinely pretty useful for.

ftkftk 2 days ago | parent [-]

I agree. It may not be the most environmentally sensitive approach, but throwaway, one-time tooling is a perfect use case.

red-iron-pine 2 days ago | parent | prev | next [-]

> Which is fine lol but I'm tired of seeing the exact same conversations.

makes me think the bots are providing these conversations

ezekg 2 days ago | parent [-]

We're definitely moving full force into a dead internet [0], I think.

[0]: https://en.m.wikipedia.org/wiki/Dead_Internet_theory

spaceman_2020 2 days ago | parent [-]

LinkedIn and Twitter are the canaries. Text-based content is already so easy to produce at scale that there’s no utility in creating it manually.

theshrike79 a day ago | parent | prev [-]

It's fun to see Vibe Coders discover the basics of project management in real time without ever reading a book :)

That's what programming with LLMs is; it's just project management: you split the tasks into manageable chunks (ones that can be completed in a single context window), you need good onboarding documentation (CLAUDE.md or the equivalent), and you need good, easy-to-access reference documentation (docs/ with markdown files).

Exactly what you use to manage a team of actual human programmers.

automatic6131 2 days ago | parent | prev | next [-]

>- LLMs' utility is high enough that they are now going to be a standard tool in the toolbox of every software engineer, but they are definitely not replacing anyone at current capability.

Right! The problem: billions of dollars have been poured into this in the form of infrastructure, datacenters, compute, and salaries. LLMs need to be at the level of replacing vast swathes of us to be worth it, and LLMs are not going to be doing that.

This is a colossal malinvestment.

utyop22 2 days ago | parent [-]

Yeah eventually reality and fantasy have to converge.

Nobody knows when. But it will. TBH the biggest danger is that all the hopes and dreams aren't materialised and the appetite for high-risk investments dissipates.

We've had this period in which you can be money-losing and it's OK. But I believe we have passed the peak of that - and this is destined to blow up.

dawnerd 2 days ago | parent | prev | next [-]

On your last point, I’ve found it about a wash in terms of time savings for me. For boilerplate / throwaway code it’s decent enough - especially if I don’t care about code quality and only want a result.

It’s wasted so much of my time trying to make it write actual production-quality code. The inconsistency and over-verbose nature kill it for me.

sunir 2 days ago | parent | prev | next [-]

All true if you one-shot the code.

If you have a sophisticated agent system that uses multiple forward and backward passes, the quality improves tremendously.

Based on my setup as of today, I’d imagine that by sometime next year that will be normal, and then the conversation will be very different; mostly around cost control. I wouldn’t be surprised if there is a breakout popular agent control-flow language by next year as well.

The net is that unsupervised AI engineering isn’t really cheaper, better, or faster than human engineering right now. Does that mean in two years it will be? Possibly.

There will be a lot of optimizations in the message traffic, token uses, foundational models, and also just the Moore’s law of the hardware and energy costs.

But really it’s the sophistication of the agent systems that controls quality more than anything. Simply following waterfall (I know, right? Yuck… but it worked) increased code quality tremendously.

I also gave it the SelfDocumentingCode pattern language that I wrote (on WikiWikiWeb) as a code review agent and quality improved tremendously again.

theshrike79 2 days ago | parent | next [-]

> Based on my setup as of today, I’d imagine that by sometime next year that will be normal, and then the conversation will be very different; mostly around cost control. I wouldn’t be surprised if there is a breakout popular agent control-flow language by next year as well.

Currently it's just VC-funded. The $20 packages they're selling are in no way cost-effective (for them).

That's why I'm driving all available models like I stole them, building every tool I can think of before they start charging actual money again.

By then local models will most likely be at a "good enough" level, especially when combined with MCPs and tool use, so I don't need to pay per token for APIs except in special cases.

tempoponet 2 days ago | parent [-]

Once local models are good enough, there will be a $20 cloud provider that can give you more context, parameters, and t/s than you could dream of at home. This is true today with services like Groq.

theshrike79 a day ago | parent | next [-]

Anthropic used to have unlimited subscriptions; then people started running agents 24/7.

Now they have 5 hour buckets of limited use.

Groq most likely stays afloat because they're a bit player - and propped up by VC money.

With a local system I can run it at full blast all the time, nobody can suddenly make it stupid by reallocating resources to training their new model, nobody can censor it or do stealth updates that make it perform worse.

sunir 2 days ago | parent | prev | next [-]

Not exactly. Those pricing models are based on intermittent usage. If you're running an AI engineer with a sophisticated agent flow, the usage is constant and continuous. That can price out to the equivalent of a dedicated cube at home over 2 years.

I had 3 projects running today. I hit my Claude Max Pro session limits twice today in about 90 minutes. I'm now keeping it down to 1 project, and I may interrupt it until the evening when I don't need Claude Web. If I could run it passively on my laptop, I would.

hatefulmoron 2 days ago | parent | prev [-]

Groq and Cerebras definitely have the t/s, but their hardware is tremendously expensive, even compared to the standard data center GPUs. Worth keeping in mind if we're talking about a $20 subscription.

zarzavat 2 days ago | parent | prev | next [-]

> If you have a sophisticated agent system that uses multiple forward and backward passes, the quality improves tremendously.

Just an hour ago I asked Claude to find bugs in a function and it found 1 real bug and 6 hallucinated bugs.

One of the "bugs" it wanted to "fix" was to revert a change that I had made previously to fix a bug in code it had written.

I just don't understand how people burning tokens on sophisticated multi-agent systems are getting any value from that. These LLMs don't know when they are doing something wrong, and throwing more money at the problem won't make them any smarter. It's like trying to build Einstein by hiring more and more schoolkids.

Don't get me wrong, Claude is a fantastic productivity boost but letting it run around unsupervised would slow me down rather than speed me up.

oblio a day ago | parent | prev [-]

> and also just the Moore’s law of the hardware and energy costs.

What Moore's law?

MontyCarloHall 2 days ago | parent | prev | next [-]

It's almost as if you could recapitulate each of these discussions using an LLM!

theshrike79 2 days ago | parent | prev | next [-]

> - The code that is written by LLMs always needs to be heavily monitored for correctness, style, and design, and then typically edited down, often by half or more

For this, language matters a lot: if whatever you're using has robust tools for linting and style checks, it makes the LLM's job a lot easier. Give it a rule (or a forced hook) to always run tests and linters before claiming a job is done, and it will iterate until what it produces matches the rules.
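
As a rough sketch (the file name and the npm commands are placeholders - substitute whatever your stack uses), the rule can simply point the model at one gate script that has to exit cleanly before it's allowed to call the task done:

  // verify.mjs - hypothetical "done means green" gate the agent must run.
  // Each step has to pass; a non-zero exit tells the model to keep iterating.
  import { execSync } from "node:child_process";

  const steps = ["npm run lint", "npm test"]; // placeholder commands

  for (const cmd of steps) {
    try {
      execSync(cmd, { stdio: "inherit" });
    } catch {
      console.error(`verify failed on: ${cmd}`);
      process.exit(1);
    }
  }
  console.log("verify passed");

The same script doubles as a pre-commit or CI step, so the bar the model is held to is the same one humans are held to.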

But LLM code has a habit of being very verbose and covering every situation, no matter how minuscule.

This is especially grating when you're doing a simple project for local use and it's bootstrapping something that's enterprise-ready :D

WorldMaker 2 days ago | parent [-]

If you force the LLM to solve every test failure, this can also lead to the same breakdown modes as very junior developers coding to the tests rather than the problem. I've seen all of:

1) I broke the tests, guess I should delete them.

2) I broke the tests, guess the code I wrote was wrong, guess I should delete all of that code I wrote.

3) I broke the tests, guess I should keep adding more code and scaffolding. Another abstraction layer might work? What if I just add skeleton code randomly - does this random-code whack-a-mole work?

That last one can be particularly "fun" because already-verbose LLM code skyrockets into baroque million-line PRs when left truly unsupervised, and that PR still won't build or pass tests.

There's no true understanding by an LLM. Forcing it to lint/build can be important and useful, but it's still not a cure-all, and it leads to even more degenerate (if "fun") cases than hand-holding it does.

godelski a day ago | parent | prev | next [-]

FWIW, I think this summary is pretty much in line with most of the "anti-LLM" crowd. Being on that "side" myself, it is not that I don't use LLMs; it is that I do not think LLMs are close to being able to replace me.

I also think there's some big variance within each of the "sides" (I think it is more a bimodal spectrum, really), with a lot tied to your last point. Sometimes they save you lots of time, sometimes they waste a lot of time. I expect more senior people are going to get fewer benefits from them because they've already spent lots of time developing time-saving strategies. Plus, writing lines is only a small part of the job. The planning and debugging stages are much more time-intensive and can be much more difficult to wrangle an LLM with. Honestly, I think it is a lot about trust. Forgetting "speed", do I trust myself to be more likely to catch errors in code that I write or code that I review?

Personally, I find that most of the time I end up arguing with the LLM over some critical detail, and I've found Claude Code will sometimes revert things that I asked it to change (these can be time-consuming errors because they are often invisible). It gives the appearance of being productive (it even feels that way), but I think it is a lot like spending time in a meeting vs. time coding. Meetings can help and are very time-consuming, but they can also be a big waste of time when overused. Sometimes it is better to have two engineers go try out their methods independently and see what works out within the larger scope. Something is always learned, too.

lordgrenville 2 days ago | parent | prev | next [-]

Yes! Reminds me of one of my all-time favourite HN comments https://news.ycombinator.com/item?id=23003595

specialist 2 days ago | parent | prev | next [-]

> ...until everyone gets tired of talking about [LLMs]

Small price to pay for shuffling the Agile Manifesto off the stage.

dboreham 2 days ago | parent | prev [-]

My experience with the latest Claude Code has been: it's not nearly as bad as you say.