Remix.run Logo
micw 4 hours ago

For me, AI is an enabler for things you can't do otherwise (or that would take many weeks of learning). But you still need to know how to do things properly in general, otherwise the results are bad.

E.g. I'm a software architect and developer for many years. So I know already how to build software but I'm not familiar with every language or framework. AI enabled me to write other kind of software I never learned or had time for. E.g. I recently re-implemented an android widget that has not been updated for a decade by it's original author. Or I fixed a bug in a linux scanner driver. None of these I could have done properly (within an acceptable time frame) without AI. But also none of there I could have done properly without my knowledge and experience, even with AI.

Same for daily tasks at work. AI makes me faster here, but also makes me doing more. Implement tests for all edge cases? Sure, always, I saved the time before. More code reviews. More documentation. Better quality in the same (always limited) time.

mirsadm 3 hours ago | parent | next [-]

I use Claude Code a lot but one thing that really made me concerned was when I asked it about some ideas I have had which I am very familiar with. It's response was to constantly steer me away from what I wanted to do towards something else which was fine but a mediocre way to do things. It made me question how many times I've let it go off and do stuff without checking it thoroughly.

physicsguy 3 hours ago | parent | next [-]

I've had quite a bit of the "tell it to do something in a certain way", it does that at first, then a few messages of corrections and pointers, it forgets that constraint.

embedding-shape 14 minutes ago | parent [-]

> it does that at first, then a few messages of corrections and pointers, it forgets that constraint.

Yup, most models suffer from this. Everyone is raving about million tokens context, but none of the models can actually get past 20% of that and still give as high quality responses as the very first message.

My whole workflow right now is basically composing prompts out of the agent, let them run with it and if something is wrong, restart the conversation from 0 with a rewritten prompt. None of that "No, what I meant was ..." but instead rewrite it so the agent essentially solves it without having to do back and forth, just because of this issue that you mention.

Seems to happen in Codex, Claude Code, Qwen Coder and Gemini CLI as far as I've tested.

ozlikethewizard 3 hours ago | parent | prev | next [-]

Call me a conspiracy theorist, and granted much of this could be attributed to the fact that the majority of code in existence is shit, but im convinced that these models are trained and encouraged to produce code that is difficult for humans to work on. Further driving and cementing the usage of then when you inevitably have to come back and fix it.

exceptione 3 hours ago | parent | next [-]

I don't think they would be able to have an LLM withouth the flaws. The problem is that an LLM cannot make a distinction between sense and nonsense in the logical way. If you train an LLM on a lot of sensible material, it will try to reproduce it by matching training material context and prompt context. The system does not work on the basis of logical principles, but it can sound intelligent.

I think LLM producers can improve their models by quite a margin if customers train the LLM for free, meaning: if people correct the LLM, the companies can use the session context + feedback to as training. This enables more convincing responses for finer nuances of context, but it still does not work on logical principles.

LLM interaction with customers might become the real learning phase. This doesn't bode well for players late in the game.

trcf23 3 hours ago | parent | prev | next [-]

Or it takes a lot of time effort and intelligence to produce good code and IA is not there yet…

CatMustard 3 hours ago | parent | prev | next [-]

This could be the case even without an intentional conspiracy. It's harder to give negative feedback to poor quality code that's complicated vs. poor quality code that's simple.

Hence the feedback these models get could theoretically funnel them to unnecessarily complicated solutions.

No clue has any research been done into this, just a thought OTTOMH.

Perz1val 2 hours ago | parent | prev [-]

It is a mathematical, averaging model after all

xgb84j 3 hours ago | parent | prev [-]

Mediocre is fine for many tasks. What makes a good software engineer is that he spots the few places in every software where mediocre is not good enough.

bonoboTP 2 hours ago | parent | prev | next [-]

Yes but in my experience this sometimes works great, other times you paint yourself in a corner and the sun total is that you still have to learn the thing, just the initial ram is less steep. For example I build my self a nice pipeline for converting jpegs on disk to h264 on disk via zero-copy nvjpeg to nvenc, with python bindings but have been pulling out my hair over bframe ordering and weird delays in playback etc. Nothing u solvable but I had to learn a great deal and when we were in the weeds, Opus was suggesting stupid hack quick fixes that made a whack a mole with the tests. In the end I had to lead e Pugh and read enough to be able to ask it with the right vocabulary to make it work. Similarly with entering many novel areas. Initially I get a rush because it "just works" but it really only works for the median case initially and it's up to you to even know what to test. And AIs can be quite dismissive of edge cases like saying this will not happen in most cases so we can skip it etc.

bandrami 32 minutes ago | parent | prev | next [-]

Huh. I'm extremely skeptical of AI in areas where I don't have expertise, because in areas where I do have expertise I see how much it gets wrong. So it's fine for me to use it in those areas because I can catch the errors, but I can't catch errors in fields I don't have any domain expertise in.

netdevphoenix an hour ago | parent | prev | next [-]

> Or I fixed a bug in a linux scanner driver. None of these I could have done properly (within an acceptable time frame) without AI. But also none of there I could have done properly without my knowledge and experience, even with AI

There are some things here that folks making statements like yours often omit and it makes me very sus about your (over)confidence. Mostly these statements talk in a business short-term results oriented mode without mentioning any introspective gains (see empirically supported understanding) or long-term gains (do you feel confident now in making further changes _without_ the AI now that you have gained new knowledge?).

1. Are you 100% sure your code changes didn't introduce unexpected bugs?

1a. If they did, would you be able to tell if they where behaviour bugs (ie. no crashing or exceptions thrown) without the AI?

2. Did you understand why the bug was happening without the AI giving you an explanation?

2a. If you didn't, did you empirically test the AI's explanation before applying the code change?

3. Has fixing the bug improve your understanding of the driver behaviour beyond what the AI told you?

3a. Have you independently verified your gained understanding or did you assume that your new views on its behaviour are axiomatically true?

Ultimately, there are 2 things here: one is understanding the code change (why it is needed, why that particular change implementation is better relative to others, what future improvements could be made to that change implementation in the future) and skill (has this experience boosted your OWN ability in this particular area? in other words, could you make further changes WITHOUT using the AI?).

This reminds me of people that get high and believe they have discovered these amazing truths. Because they FEEL it not because they have actual evidence. When asked to write down these amazing truths while high, all you get in the notes are meaningless words. While these assistants are more amenable to get empirically tested, I don't believe most of the AI hypers (including you in that category) are actually approaching this with the rigour that it entails. It is likely why people often think that none of you (people writing software for a living) are experienced in or qualified to understand and apply scientific principles to build software.

Arguably, AI hypers should lead with data not with anecdotal evidence. For all the grandiose claims, the lack of empirical data obtained under controlled conditions on this particular matter is conspicuous by its absence.

jacquesm 39 minutes ago | parent | next [-]

It's incredible that within two minutes after posting this comment is already grayed out whereas it makes a number of excellent points.

I've been playing with various AI tools and homebrew setups for a long time now and while I see the occasional advantage it isn't nearly as much of a revolution as I've been led to believe by a number of the ardent AI proponents here.

This is starting to get into 'true believer' territory: you get these two camps 'for and against' whereas the best way forward is to insist on data rather than anecdotes.

AI has served me well, no doubt about that. But it certainly isn't a passe-partout and the number of times it has caused gross waste of time because it insisted on chasing some rabbit simply because it was familiar with the rabbit adds up to a considerable loss in productivity.

The scientific principle is a very powerful tool in such situations and anybody insisting on it should be applauded. It separates fact from fiction and allows us to make impartial and non-emotional evaluations of both theories and technologies.

svara 14 minutes ago | parent [-]

> (...) you get these two camps 'for and against' whereas the best way forward is to insist on data rather than anecdotes.

I think that's an issue with online discussions. It barely happens to me in the real world, but it's huge on HN.

I'm overall very positive about AI, but I also try to be measured and balanced and learn how to use it properly. Yet here on HN, I always get the feeling people responding to me have decided I am a "true believer" and respond to the true believer persona in their head.

KptMarchewa 44 minutes ago | parent | prev [-]

Why would you ever, outside flight and medical software, care about being 100% sure that the change did not introduce any bugs?

jacquesm 39 minutes ago | parent | next [-]

Because bugs are bad. Fixing one bug but accidentally introducing three more is such a pattern it should have a name.

KptMarchewa 14 minutes ago | parent [-]

They are. And we have processes to minimize them - tests, code review, staging/preprod envs - but they are nowhere close to being 100% sure that code is bug free - that's just way too high bar for both AI and purely human workflows outside of few pretty niche fields.

jacquesm a few seconds ago | parent [-]

When you use AI to 'fix' something you don't actually understand the chances of this happening go up tremendously.

bandrami 35 minutes ago | parent | prev [-]

Because why would you make something broken when you could make something not broken?

KptMarchewa 14 minutes ago | parent [-]

Because it's way too high bar to be 100% sure outside of few niche fields.

ivell 3 hours ago | parent | prev | next [-]

In my case I built a video editing tool fully customized for a community of which I am a member. I could do it in a few hours. I wouldn't have even started this project as I don't have much free time, though I have been coding for 25+ years.

I see it empowering to build custom tooling which need not be a high quality maintenance project.

joshbee 3 hours ago | parent | prev | next [-]

I'm in the same boat. I've been taking on much more ambitious projects both at work and personally by collaborating with LLMs. There are many tasks that I know I could do myself but would require a ton of trial and error.

I've found giving the LLMs the input and output interfaces really help keep them on rails, while still being involved in the overall process without just blindly "vibe coding."

Having the AI also help with unit tests around business logic has been super helpful in addition to manual testing like normal. It feels like our overall velocity and code quality has been going up regardless of what some of these articles are saying.

rustyhancock an hour ago | parent [-]

100% agree with AI expanding core testing from my own edge and key tests.

I agree, I write out the sketch of what I want. With a recent embedded project in C I gave it a list of function signatures and high level description and was very satisfied with what it produced. It would have taken me days to nail down the particulars of the HAL (like what kind of sleep do I want what precisely is the way to setup the WDT and ports).

I think it's also language dependent.

I imagine JavaScript can be a crap shoot. The language is too forgiving.

Rust is where I have had most success. That is likely a personal skill issue, I know we want a Arc<DashMap>, will I remember all the foibles of accessing it? No.

But given the rigidity of the compiler and strong typing I can focus on what the code functionally is doing, that in happy with the shape/interface and function signature and the compiler is happy with the code.

It's quite fast work. It lets me use my high level skills without my lower level skills getting in the way.

And id rather rewrite the code at a mid-level then start it fresh, and agree with others once it's a large code base then in too far behind in understanding the overall system to easily work on it. That's true of human products too - someone elses code always gives me the ick.

joshbee an hour ago | parent [-]

Vanilla javascript is hit or miss for anything complex.

Using Typescript works great because you can still build out the interfaces and with IDE integrations the AIs can read the language server results so they get all the type hints.

I agree that the AI code is usually a pretty good starting point and gets me up to speed for new features fast rather than starting everything from scratch. I usually end up refactoring the last 10-20% manually to give it some polish because some of the code still feels off some times.

varjag 3 hours ago | parent | prev | next [-]

I think what we'll see as AI companies collect more usage data the requirements for knowing what you do will sink lower and lower. Whatever advantage we have now is transient.

viraptor 3 hours ago | parent | prev | next [-]

> But you still need to know how to do things properly in general, otherwise the results are bad.

Even that could use some nuance. I'm generating presentations in interactive JS. If they work, they work - that's the result, and I extremely don't care about the details for this use case. Nobody needs to maintain them, nobody cares about the source. There's no need for "properly" in this case.

kilninvar an hour ago | parent | prev | next [-]

I've found this is exact opposite of what I'd dare do with AI, things you don't understand are things you can't verify. Consider you want a windowed pane for your cool project, so you ask an AI to draft a design. It looks cool and it works! Until you bring it outside where after 30 minutes it turns into explosive shrapnel, because the model didn't understand thermal expansion, nor did you.

Contrast this to something you do know but can't be arsed to make; you can keep re-rolling a design until you get something you know and can confirm works. Perfect, time saved.

trcf23 3 hours ago | parent | prev [-]

Also most of the studies shown start to be obsolete with AI rapid path of improvements. Opus 4.5 has been a huge game changer for me (combined with CC that I had not used before) since December. Claude code arrived this summer if I’m not mistaken.

So I’m not sure a study from 2024 or impact on code produced during 2024 2025 can be used to judge current ai coding possibilities.

jacomoRodriguez 2 hours ago | parent [-]

Agreed, this space move so fast, 2024 feels like light-years away in terms of capabilities.