bonesss 3 days ago

Past the sea change: half the reason those prompt and harness solutions seem to work is LLM lies. The testing is gassing you up about how it works and how effective it is, defaulting to ‘yes’.

If you test specific features of those solutions over time you see very inconsistent results, lots of lies, and seemingly stable solutions that one-shot well but suddenly experience behaviour changes due to tweaks on the backend. Tuesday's awesome agent stack that finally works is loading totally differently on Thursday, and debugging is “oh, sorry, it’s better now” even when it isn’t. Compression, lies, and external hosting are a bad combo.

Sometimes I imagine a world where computers executed programs the same way each time. You could write some code once and run it a whole calendar month later with a predictable outcome. What a dream, we can hope I guess.

skydhash 3 days ago | parent [-]

People are doing toy projects and praising them, while some are testing them in real world situations and not finding them that useful. But the former are labelling the latter as luddites and telling them they will be left behind.

abustamam 3 days ago | parent [-]

As someone at the intersection of both (I've built a lot of vibe coded toy projects and lead a vibe coding initiative at work), they're both right and both wrong.

For a single dev team, vibe coding is great. Write specs, write plans, write code. I know what the project wants and needs because I'm the target market.

At work, I haven't written more than a few lines of code since December. But I work with other people vibe coding this same project. Lots of changing requirements and rapid iteration. Lots of mistakes were made by everyone involved. Lots of tech debt. Sure, we built something in 2 mos that would have otherwise taken us 6 mos, but now I'm fixing the mess that we caused.

I think the critical difference is the attitude towards our situation. My boss said to fix the AI harness so we can vibe code more confidently and freely. But other bosses might cut their losses and ban vibe coding. Who's right? I dunno. In both cases I'd just do what my boss wants me to do. But it's not that I don't want to be left behind. I don't want to lose my job. There's a difference.

patrick451 2 days ago | parent [-]

> Sure, we built something in 2 mos that would have otherwise taken us 6 mos, but now I'm fixing the mess that we caused.

You didn't actually build it in 2 months.

abustamam 2 days ago | parent [-]

Even if it takes me a month to fix (likely a week tbh), then it took us 3 months to build.

herewulf 2 days ago | parent [-]

A mere 2x productivity improvement sounds like something you could achieve by introducing new tools that are predictable (i.e., not AI).

abustamam a day ago | parent [-]

Perhaps. 2x is still 2x. And new tools still need to be vetted and learned.

It's strange that the goalposts seem to have moved from "AI is a net negative for productivity" to "only a 2x improvement isn't worth it".