Aurornis 5 hours ago

Writing detailed specs and then giving them to an AI is not the optimal way to work with AI.

That's vibecoding with an extra documentation step.

Also, Sonnet is not the model you'd want to use if you want to minimize cleanup. Use the best available model at the time if you want to attempt this, but even those won't vibecode everything perfectly for you. This is the reality of AI, but at least try to use the right model for the job.

> Therefore I need more time and effort with Gen AI than I needed before

Stop trying to use it as all-or-nothing. You can still make the decisions, call the shots, write code where AI doesn't help and then use AI to speed up parts where it does help.

That's how most non-junior engineers settle into using AI.

Ignore all of the LinkedIn and social media hype about prompting apps into existence.

EDIT: Replaced a reference to Opus and GPT-5.5 with "best available model at the time" because it was drawing a lot of low-effort arguments

wg0 4 hours ago | parent | next [-]

> Writing detailed specs and then giving them to an AI is not the optimal way to work with AI.

It is NOT the way to work with humans either, because most software engineers I worked with in my career were incredibly smart and damn good at identifying edge cases and weird scenarios, even when they weren't told about them and the domain wasn't theirs to begin with. You didn't need to write lengthy, several-page Jira tickets. Just a brief paragraph, and that's it.

With AI, you need to spell everything out in detail. But even that is NO guarantee, because these models are NOT deterministic in their output: the same prompt produces a different output each time. That's why every chat box has that "Regenerate" button. So even a correct and detailed prompt might not lead to correct output. You're literally rolling dice with a random number generator.

Lastly - no matter how smart and expensive the model is, the underlying working principles are the same as GPT-2: the same transformers with RL on top, the same list of token probabilities, and the same temperature used to randomly select one token to complete the output, which is then fed back in to generate the next token.
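The sampling loop described above can be sketched in a few lines. This is a toy illustration of temperature sampling in an autoregressive loop, not any real model's code; `model` here stands in for whatever produces next-token scores:

```python
import math
import random

def sample_next_token(logits, temperature=0.8, rng=random):
    # Softmax with temperature: higher temperature flattens the
    # distribution, making lower-probability tokens more likely.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample one token index from the distribution. This is the
    # random step that makes identical prompts diverge.
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1

def generate(model, prompt_tokens, n_new, temperature=0.8):
    # Autoregressive loop: each sampled token is appended and fed
    # back in to score the next one.
    tokens = list(prompt_tokens)
    for _ in range(n_new):
        logits = model(tokens)  # next-token scores
        tokens.append(sample_next_token(logits, temperature))
    return tokens
```

With temperature near zero the argmax token dominates and output becomes (nearly) deterministic; at typical settings the same prompt can diverge at the very first sampled token, which is the non-determinism the parent comment describes.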

throwaway7783 2 hours ago | parent | next [-]

This is not true in my experience at all. I never write such a detailed spec for AI - that is my value as the human in the loop: to iterate, to steer, and to make decisions. The AI in fact catches more edge cases than I do, and can point me to things I never considered myself.

Our productivity has increased manyfold, and code quality has increased significantly, because writing tests is no longer a chore or an afterthought - and the biggest blocker for us, "test setup is too complicated," is gone. It's showing in a decrease in customer-reported issues.

snarkconjecture 3 hours ago | parent | prev | next [-]

> the underlying working principles are the same as GPT-2

I don't think anyone was claiming otherwise. Sonnet is still better at writing code than GPT-2, and worse than Opus. Workflows that work with Opus won't always work with Sonnet, just as you can't use GPT-2 in place of Sonnet to do code autocomplete.

jonas21 3 hours ago | parent | prev [-]

> That's why every chat box has that "Regenerate" button.

Wait, are you doing this in the web chat interface?!

That's definitely not a good way to work. You need to be using a harness (like Claude Code) where the agent can plan its work, explore the codebase, execute code, run tests, etc. With this sort of setup, your prompts can be short (1 to 5 sentences) and still get great results.
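The plan/explore/execute cycle such a harness runs can be sketched roughly as a tool-use loop. This is a hypothetical skeleton (the tool names and message shapes are made up for illustration); real harnesses like Claude Code wire this to a model API, a shell, and the filesystem:

```python
def run_agent(model, tools, task, max_steps=20):
    # Conversation history the model sees on every step.
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        # The model decides the next action: call a tool or finish.
        action = model(history)
        if action["type"] == "finish":
            return action["summary"]
        # Execute the requested tool (e.g. read a file, run tests)
        # and feed the observation back so the model can iterate.
        result = tools[action["tool"]](**action["args"])
        history.append({"role": "tool",
                        "name": action["tool"],
                        "content": result})
    return "step limit reached"
```

The short-prompt claim follows from this loop: the agent recovers missing context itself by exploring the codebase and reading test output, instead of needing it all spelled out up front.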

wg0 2 hours ago | parent [-]

I use the Claude CLI or OpenCode. The "Regenerate" example is just to illustrate that the same prompt produces a different output each time. You're rolling dice.

rafram 5 hours ago | parent | prev | next [-]

> Opus or GPT-5.5 are the only ways to even attempt this.

It’s pretty funny to claim that a model released 22 hours ago is the bare minimum requirement for AI-assisted programming. Of course the newest models are best at writing code, but GPT-* and Claude models have been writing pretty decent systems for six months or so, and they’ve been good at individual snippets/edits for years.

Aurornis 5 hours ago | parent [-]

> It’s pretty funny to claim that a model released 22 hours ago is the bare minimum requirement for AI-assisted programming.

Not what I said.

The OP was trying to write specs and have an AI turn it into an app, then getting frustrated with the amount of cleanup.

If you want the AI to write code for you and minimize your cleanup work, you have to use the latest models available.

They won't be perfect, but they're going to produce better results than using second-tier models.

rafram 4 hours ago | parent [-]

Is it actually the case that 5.5 is that much better at implementing specs than its very capable predecessor released a month ago? Just seems like a baseless and silly claim about a model that has barely been out long enough for anyone to do serious work with it.

Aurornis 4 hours ago | parent | next [-]

> Is it actually the case that 5.5 is that much better at implementing specs than its very capable predecessor released a month ago?

The OP comment was talking about Claude Sonnet. I was comparing to that.

I should have just said "use the best model available"

ghurtado 4 hours ago | parent | prev [-]

> Is it actually the case that 5.5 is that much better

Nobody was talking about how much better it is until you wrote this though

It's like you're building your own windmills brick by brick

munk-a 5 hours ago | parent | prev | next [-]

> Stop trying to use it as all-or-nothing. You can still make the decisions, call the shots, write code where AI doesn't help and then use AI to speed up parts where it does help.

You're assuming that finding the places where AI needs help isn't already a larger task than just writing it yourself. AI can be helpful in development in very limited scenarios, but the main thrust of the comment above yours is that it takes longer to read and understand code than to write it - and AI tooling is currently focused on writing code.

We're optimizing the easy part at the expense of the difficult part - in many cases it simply isn't worth the trouble. (The cases where it is helpful, IMO, are when AI aids code comprehension rather than producing new code.)

Aurornis 5 hours ago | parent | next [-]

> You're assuming that finding the places where AI needs help isn't already a larger task than just writing it yourself.

Not assuming anything, I'm well versed in how to do this.

Anyone who defers to having AI write massive blocks of code they don't understand is going to run into this.

You have to understand what you want and guide the AI to write it.

The AI types faster than me. I can have the idea and understand and then tell the LLM to rearrange the code or do the boring work faster than I can type it.

Exoristos 4 hours ago | parent | next [-]

The number of devs I've worked with who can't touch-type and don't use or know their way around a proper IDE is depressingly large.

Aurornis 4 hours ago | parent | next [-]

Same with debuggers. I run into people with 10 years of experience who are still trying to printf debug complex problems that would be easy with 5 minutes in a debugger.

I think we're seeing something similar with AI: There are devs who spend a couple days trying to get AI to magically write all of their code for them and then swear it off forever, thinking they're the only people who see the reality of AI and everyone else is wrong.

munk-a 38 minutes ago | parent [-]

At the same time - there are devs who spend two days setting up a debugger for a simple problem that would be easy with five minutes and printf. AI is a tool, and a useful one - it's not always the best tool for the job, and the real skill is in knowing when to use it and when not to.

It's a sort of fact of life that the easy problems are solved - the ones where an extreme answer is always correct are things we no longer even consider problems. Most of the choices that remain have their advantages and disadvantages, so the true answer is somewhere in the middle.

throwuxiytayq 21 minutes ago | parent | prev [-]

This isn't about touch typing or IDE tricks. I'm an IDE power user and - reasoning aside - I used to run circles around my peers when it comes to raw code editing efficiency. This is increasingly an obsolete workflow. LLMs can execute codebase-wide refactors in seconds. You can use them as a (foot-)shotgun, or as a surgical tool.

ryan_n 4 hours ago | parent | prev | next [-]

You've come full circle and are essentially just describing what the OP was saying in their initial post lol.

kakacik 4 hours ago | parent | prev [-]

If you are trying to sell it, you are doing a poor job and effectively siding with the OP while desperately trying to argue the opposite.

Juniors mostly behave better than what you describe - I certainly never had to correct as much after any junior as the OP describes. And if you have "boring code" in your codebase, maybe that signals a not-so-great architecture (and I presume we aren't talking about code generators, which have existed since the 90s at least).

Also, any senior worth their salt wants to intimately understand their code - it's the only way you can guarantee correctness at all. Man, I could go on and pick your statements apart one by one, but that would take too long.

_puk 4 hours ago | parent | prev [-]

The problem I have with this take is it's focused on solving the right now problem.

Yes, it's quicker to do it yourself this time. But if we build out the artifacts to do a good enough job this time, next time the AI will have all the context it needs to take a good shot at it - and if AI does overtake hand-writing code in the meantime, you've got an insane head start.

Which side of history are you betting on?

munk-a 3 hours ago | parent [-]

I don't believe that investing more of my time in a slower process now would give me an advantage later once that process is refined. I've toyed around with these tools and know enough to get an environment up and running - so what would I gain from using them heavily right now, when the tools may change significantly before more efficient ways of using them settle?

I'm okay not being at the bleeding edge - I can see the remains of the companies that aggressively switched to the newest best thing. Sometimes it pays off and sometimes it doesn't. I'm comfortable being the person who waits until something hits 2.0 and the advantages and disadvantages are clear before seriously considering a migration.

afro88 4 hours ago | parent | prev | next [-]

> Writing detailed specs and then giving them to an AI is not the optimal way to work with AI.

> That's vibecoding with an extra documentation step.

Read uncharitably, yeah. But you're making a big assumption that the writing of the spec wasn't driven by the developer, checked by the developer, adjusted by the developer - rewritten when incorrect, etc.

> You can still make the decisions, call the shots

One way to do this is to do the thinking yourself, tell it what you want it to do specifically and... get it to write a spec. You get to read what it thinks it needs to do, and then adjust or rewrite parts manually before handing off to an agent to implement. It depends on task size of course - if small or simple enough, no spec necessary.

It's a common pattern to hand the spec off to a good instruction-following model - and a fast one if possible. Gemini 3 Flash is very good at following a decent spec, for example. But Sonnet is also fine.

> Stop trying to use it as all-or-nothing

Agree. Some things just aren't worth chasing at the moment. For example, in native mobile app development it's still almost impossible to get accurate, idiomatic UI that uses native components properly, adheres to the HIG, etc.

mandeepj 5 hours ago | parent | prev | next [-]

Sure, Opus is a level above Sonnet, but it still doesn't free the OP from these handcuffs: reading the code, understanding it, and building a mental model is what's so labour-intensive.

Aurornis 5 hours ago | parent | next [-]

The OP's problem was treating the situation as two extremes: Either write everything myself, or defer entirely to the AI and be forced to read it later.

I was trying to explain that this isn't how successful engineers use AI. There is a way to understand the code and what the AI is doing as you're working with it.

Writing a spec, submitting it to the AI (a second-tier model at that) and then being disappointed when it didn't do exactly what you wanted in a perfect way is a tired argument.

WesolyKubeczek 5 hours ago | parent | prev [-]

But when you write code by hand, you at least are there as it’s happening, which makes reading and understanding way easier.

elAhmo 5 hours ago | parent | prev | next [-]

Funny hearing you’re saying only GPT 5.5 (and Opus) can do this, having in mind that it came out last night.

Aurornis 5 hours ago | parent [-]

To be clear, I'm not saying that they can do this.

I'm saying that if you're trying to have AI write code for you and you want to do as little cleanup as possible, you have to use the best model available.

ForOldHack 3 hours ago | parent | prev [-]

"Writing detailed specs and then giving them to an AI is not the optimal way to work with AI." Perfect. I loosely define things, then correct it and tell it to make the corrections, and it adapts - but you have to constantly watch it. It's like a glorified auto-typer.

"Ignore all of the LinkedIn and social media hype about prompting apps into existence." Absolutely - it's not even hype, it's pure marketing bullshitzen.