I've hit this point with AI where it's not a simple process, but a long, drawn-out back and forth.

I'll use AI to design the implementation of a medium-sized, cross-cutting feature. Review all the details, maybe iterate on just that. Then implement with Claude 4.7 Max - which runs slower, but does a better job. Then review the implementation, then have Codex GPT 5.5 xhigh fast review it - which almost always finds corner cases. Have Claude fix those - Claude is better at writing intuitive, maintainable code than Codex, whose code tends to be overengineered or full of shortcuts. (Codex is better at finding/fixing bugs and doing reviews - it's annoyingly pedantic.)

Then repeat with fresh Claude/Codex instances, having them both review the current staged changes, getting feedback and handling it. Then cover it in tests. Overall I still implement the feature faster than coding it manually, but I spend the majority of the time going back and forth with reviews and handling corner cases, and at the end I have what I feel is a really solid implementation of whatever feature I'm working on. The v1 feature feels more like a v3 given the amount of iteration it already went through.

▲

aomix an hour ago | parent | next [-]

Talking the problem to death with the AI before implementation is a nice zone for me. I feel productive, get good results out of the AI, and still largely understand the code. That’s the part of the AI revolution that I feel has made me a better engineer because I argue about design and architecture all day with a robot.

▲

mikepurvis 11 minutes ago | parent | next [-]

Despite the cynical sibling reply, I also feel like there's real value here. Contrary to the meme, I don't think Claude just tells me I'm brilliant, but really does push back on directions that are unproductive, identify when a part is overcomplicated or a dependency has become redundant, etc. Those are important things to have at least a sightline on before getting too deep into the code, even in a world where a lot of code can be created basically for free.

▲

qsera 43 minutes ago | parent | prev [-]

>I argue about design and architecture all day with a robot.

You will outgrow it at some point.

	▲	bartread 7 minutes ago | parent | next [-]
		I think this is OK though. We can still micromanage[0] the code generation part for a useful productivity boost, I think. [0] At least, in my experience, "micromanaging" the AI is what gives me the best results. Iterating on the initial design, then iterating on the plan, then reviewing the proposed code changes (including tests), then getting an independent code review from another LLM, etc. If you give an LLM too much latitude, that's when the really shitty code and ill-considered breaking changes/obliteration of existing functionality start to creep in.
	▲	Terretta 10 minutes ago | parent | prev | next [-]
		Or learn something at some point. https://en.wikipedia.org/wiki/Rubber_duck_debugging
	▲	busterarm 11 minutes ago | parent | prev | next [-]
		nullsanity's comment is dead and downvoted to oblivion but also incredibly underrated. I was more annoyed than anything that I didn't hit this moment until my 40s. Except it's not just reddit (I quit reddit 15 years ago). It's the whole internet.
	▲	nullsanity 22 minutes ago | parent | prev [-]
		[dead]

▲

scosman 2 hours ago | parent | prev | next [-]

yes exactly. Too many people ask AI to one-shot complex tasks, and wonder why it behaves like a junior asked to rush something.

I have my own skill: 5 rounds of research/planning/test-planning. Interactive with me in loop for all important decisions. Starts with high level shape, then details. Planning can take 2-3 days of my time, then the implementation agent can take many hours (Opus 4.7). It splits the implementation across many phases/commits, each with its own code-review fix loop. Deep code review at the end can take another hour or two. It opens a PR, Gemini reviews, it reads out and resolves those issues.

Projects still take days or weeks, but 5x faster than doing it all myself.

Edit: the skill - https://github.com/scosman/vibe-crafting

▲

dawnerd 22 minutes ago | parent | next [-]

Even fully planned it’s still no better than a junior dev. You’re leaving out how much back and forth you have the ai do on itself, which you’d have on a junior dev too. In the end does it matter if it’s giving you what you want? Guess not really. But let’s not act like it’s crazy good when you’re still doing a lot of rounds of revisions on something an experienced dev would know to do right the first time.

▲

deadbabe 2 hours ago | parent | prev [-]

Does the 5x faster include shipping? Or just the work part?

IMO if you are not shipping out faster then the faster work gains are meaningless.

If you are shipping faster, you’re probably picking up more work and shipping everything too fast leading to burnout.

	▲	mhluongo 2 hours ago | parent | next [-]
		If you're not shipping faster, it's meaningless, and if you are, it's also bad?
	▲	scosman an hour ago | parent | prev [-]
		yup.

▲

newsicanuse 2 minutes ago | parent | prev | next [-]

At this point one might as well code by themselves

▲

dawnerd 24 minutes ago | parent | prev | next [-]

When I use ai to code this is pretty close to my workflow too but I find it ends up taking at best just as long as if I were to write the code myself. In some cases I’ve thrown away what the ai has done and just done it myself. I think that’s just a skill people need to learn - at a certain point you have to cut your losses. I’ve seen some coworkers argue back and forth with an llm trying to get it to do something. Especially true on simpler changes.

▲

democracy 39 minutes ago | parent | prev | next [-]

Similar approach, but I also go a step further with some basic manual architecture/high level contract/stubs setups, just to keep it consistent with other systems (and easier reading as well).

▲

chrisweekly 2 hours ago | parent | prev | next [-]

You helpfully cite Claude w/ Opus 4.7 max and Codex w/ GPT5.5 xhigh fast, but what "AI" do you use for the initial design?

	▲	bottlepalm 2 hours ago | parent [-]
		Claude primarily, though will sometimes get a second opinion from Codex.

▲

rootnod3 3 hours ago | parent | prev | next [-]

And then Anthropic has an outage and you what...have a coffee break until then? All that time babysitting the AIs just to be a little faster but probably with less knowledge/control over what they did?

▲

afavour 2 hours ago | parent | next [-]

I don’t think you’re quite getting what OP is describing. I work in a similar way… I am aware of all the code being written. If Claude had an outage I could write it myself. It would just take longer.

You say “all that time” babysitting AIs but in my experience it isn’t that much time, if anything the back and forth at the planning stages is more productive than when I’m doing it by myself because I’m being asked questions and having to think things through from different angles.

▲

gitaarik 7 minutes ago | parent | prev | next [-]

What do you do when your search engine goes down?

▲

efitz 2 hours ago | parent | prev | next [-]

If you only have one AI window open, you’re doing it wrong. You task swap to another window/agent, get it working on something, rinse and repeat. I can keep 4 busy most of the time. When I task swap I also check in on what the other agents are doing to make sure they’re on track, not blocked and not struggling.

▲

well_ackshually an hour ago | parent [-]

congratulations on your soon to be coming burnout.

Keeping that many tasks in parallel, running all the time will kill you.

▲

speff an hour ago | parent | next [-]

I suppose it depends how hands-off the tasks are - I max out at 2 parallel sessions working on different parts and it's fairly exhausting once done. I can see the number of parallel work increasing if there's a good dev/test loop. But at $WORK, that's not usually an option.

	▲	rootnod3 an hour ago | parent [-]
		So, hands-off meaning "just let the AI cook and don't check it"? Either you follow everything it does, revise the plans, do the code review, manual adjustments, etc, or you run sessions in parallel, not being that attentive and constantly context-switch (also resulting in less attention I guess). I fail to see the benefits honestly.

▲

DonHopkins 31 minutes ago | parent | prev [-]

It's great to work from home so you can take nice little micro naps while code's generating, reviewing, building, and deploying.

A calm, attentive alternative to vibe coding: restful coding.

It's much easier to read and review code after a refreshing cat nap, especially with a real cat.

Too bad it's not usually acceptable to do that in the office. It should be! Slacking off by sword fighting all day is too exhausting.

https://xkcd.com/303/

▲

jerezzprime 20 minutes ago | parent | prev | next [-]

Yes get a coffee. Being able to execute 5 things at once is amazing, but it's a recipe for burnout. We have to be more careful and explicit about how we spend our time, and that means more explicit time away. If this thing makes you 10x more effective (I truly believe it can), you can afford to spend 20% less time behind the desk and more time doing whatever it is that actually makes you happy. Hopefully your manager understands that calculus.

▲

bottlepalm 2 hours ago | parent | prev | next [-]

As the AI is working, I am working - reviewing, regression testing, thinking about whether the current implementation is too complex and how to simplify it, etc. I totally review and understand everything the AI is generating and often push back, have it re-do something, or do it myself. In the end I feel like the quality of the work is at a v3 level in the time it took to do a v1. The productivity and quality increase is real.

▲

comradesmith 2 hours ago | parent | prev | next [-]

I’ll deal with that problem when it happens

▲

raven12345 31 minutes ago | parent | prev | next [-]

You can have multiple tasks running

▲

refactor_master 2 hours ago | parent | prev | next [-]

We're already having coffee breaks when AWS and CloudFlare are down. What's another break in the mix? If anything, we might be lucky that they're down at the same time, so we can consolidate the breaks.

▲

mohamedkoubaa 2 hours ago | parent | prev | next [-]

And then solar radiation permanently knocks out the electrical grid and you what... have a coffee break until society finds a new equilibrium?

▲

busterarm 8 minutes ago | parent | prev | next [-]

Company I'm familiar with that went all in on Codex ran out of tokens for a week and wouldn't increase their spend.

A pretty significant number of their engineers flat out refused to work. Like publicly said so. "Increase our plan or I'm taking the week off."

▲

8note 2 hours ago | parent | prev | next [-]

why not?

then demand some lack-of-uptime compensation for a lack of uptime

▲

wahnfrieden 2 hours ago | parent | prev | next [-]

Codex has 99.98% uptime
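
For scale, here is what that percentage works out to in hours. This is only arithmetic on the figure quoted above, taken at face value; the 99.98% number itself is the commenter's and is not verified here, and `downtime_hours_per_year` is a name made up for this sketch.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into expected hours of downtime per year."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


print(round(downtime_hours_per_year(99.98), 2))  # 1.75
```

So even if the figure holds, that is a little under two hours a year, which may or may not land in the middle of your workday.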

▲

glhaynes 2 hours ago | parent | prev | next [-]

"All that time babysitting the AIs just to be a little faster" doesn't seem like an accurate/unbiased portrayal of what they said: "The v1 feature feels more like a v3 given the amount of iteration it already went through."

▲

soupspaces 2 hours ago | parent | prev [-]

In Soviet Russia, the AI babysits you https://en.wikipedia.org/wiki/In_Soviet_Russia

▲

sunsetSamurai an hour ago | parent | prev | next [-]

maybe it's a dumb question, but how do you feed the results of one agent to another? do you copy and paste manually? or how do you do it programmatically?

	▲	bottlepalm 6 minutes ago | parent | next [-]
		Yea I'll take the review feedback from one, validate it, and then copy/paste it into the other session saying like, "hey I got this feedback, what do you think?" So I'm not even telling the other AI the feedback is valid, I want it to independently validate it. Often the feedback is not like a bug, but a red flag, design consideration, or trade off. Often depending on how complex the feedback, I'll do it one at a time addressing each one individually. And after the feedback is addressed, I'll go back to the AI that generated the feedback and say like, "I handled 4/5 items you found, can you double check." It's similar to handling PR feedback, where you do it, validate it, but then still have to submit it for peer review.
	▲	kevinsync an hour ago | parent | prev | next [-]
		When I pair Claude and Codex, I use claude-co-commands [0] to drive from Claude and talk to Codex via MCP. Lately I've found Codex has been far more consistent for my specific projects, so I've just been almost entirely inside Codex. YMMV [0] https://github.com/SnakeO/claude-co-commands
	▲	adrianN an hour ago | parent | prev | next [-]
		Having the agents write their plans into text files and iterating on those works reasonably well.
	▲	DonHopkins 36 minutes ago | parent | prev [-]
		Just switch models whenever you want with the menu at the bottom of the chat window in Cursor. And maybe don't use tools that lock you into one model?
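
For anyone who wants the manual hand-off described in bottlepalm's reply a bit more structured, a minimal sketch might look like the following. It does not call any real agent API: `FeedbackItem`, `ReviewRound` and their methods are hypothetical names invented here, and the prompts are just the wording quoted in that reply. You would still paste the generated text into each session yourself.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackItem:
    """One finding from the reviewing agent (hypothetical structure)."""
    text: str
    handled: bool = False


@dataclass
class ReviewRound:
    """Findings from one reviewer session, relayed by hand to the implementer."""
    items: list[FeedbackItem] = field(default_factory=list)

    def implementer_prompt(self, index: int) -> str:
        # Neutral framing: ask for an independent opinion instead of
        # telling the implementing agent that the feedback is valid.
        return "Hey, I got this feedback, what do you think?\n\n" + self.items[index].text

    def mark_handled(self, index: int) -> None:
        self.items[index].handled = True

    def reviewer_followup(self) -> str:
        # Goes back to the agent that produced the feedback, like
        # resubmitting a PR for peer review after addressing comments.
        handled = sum(item.handled for item in self.items)
        prompt = f"I handled {handled}/{len(self.items)} items you found, can you double check?"
        skipped = [item.text for item in self.items if not item.handled]
        if skipped:
            prompt += "\n\nNot addressed:\n" + "\n".join(f"- {t}" for t in skipped)
        return prompt


review = ReviewRound([FeedbackItem(f"finding {n}") for n in range(1, 6)])
for i in range(4):  # address the items one at a time
    print(review.implementer_prompt(i))
    review.mark_handled(i)
print(review.reviewer_followup())
```

Keeping the handled/unhandled state outside either chat session is the point of the sketch: neither agent gets to decide on its own that an item is closed.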

▲

nomel an hour ago | parent | prev | next [-]

I've noticed the following really helps (most important at end):

1. Have Claude form the plan and converse with a simple "Note any concerns with this plan" type plan-critic agent.

2. Let it run.

3. After (with everything in context) have it make a future_recommendations.md.

4. Have it make a plan.md to implement those future recommendations, conversing with the plan critic.

5. Clear context. Repeat with 1. Do this loop a few times, with some feedback from actual review thrown in.

But, most importantly, because Claude will aggressively try to maintain code "as is", and happily build on its previous crap, while preferring to hand roll implementations of everything, add something like this to memories/directives:

* When evaluating designs, default to "pull in the library" over "hand-roll it." Hand-rolling is much worse than a dependency.

* "Precedent" / "matches house style" / "reuses existing pattern" / "consistent with what we already do" are not valid engineering arguments.

* This project is still in the development stage with no real deployments. Mitigation costs and existing precedent are not a concern.

With these, in the last week that I've started using them (after inspecting the insane justifications for leaving crap design decisions in the plans), Claude went from junior level slop that required more oversight than it was worth to something very reasonable, using standard libraries, requiring nudges for architecture rather than pure "wtf!?".

I think they've fine-tuned heavily towards "don't rewrite the codebase", which is completely rational from multiple perspectives, but also not appropriate for new code.

I do enjoy a considerable daily token allowance, so this may not apply to everyone.
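
The five-step loop in nomel's comment above can be written down as a rough sketch. Everything here is an assumption made for illustration: `Agent` is a made-up interface (a function from prompt and context to text), `run_round` and `run_loop` are hypothetical names, and nothing below talks to Claude or any real tool. The directives are abbreviated from the bullets in that comment, and the "feedback from actual review" part is left out.

```python
from typing import Callable

# Hypothetical interface: an agent takes (prompt, context) and returns text.
Agent = Callable[[str, list[str]], str]

# Abbreviated from the directives in the comment above.
DIRECTIVES = [
    'When evaluating designs, default to "pull in the library" over "hand-roll it."',
    '"Precedent" / "matches house style" are not valid engineering arguments.',
    "This project is still in the development stage with no real deployments.",
]


def plan_with_critic(worker: Agent, critic: Agent, prompt: str, context: list[str]) -> str:
    """Draft a plan, get the critic's concerns, then revise once."""
    draft = worker(prompt, context)
    concerns = critic("Note any concerns with this plan", [draft])
    return worker("Revise the plan to address these concerns", context + [draft, concerns])


def run_round(worker: Agent, critic: Agent, task: str) -> dict[str, str]:
    context = list(DIRECTIVES)  # step 5: every round starts from a cleared context
    plan = plan_with_critic(worker, critic, f"Form a plan for: {task}", context)  # step 1
    context.append(plan)
    context.append(worker("Carry out the plan", context))  # step 2
    recs = worker("Write future_recommendations.md", context)  # step 3
    context.append(recs)
    next_plan = plan_with_critic(  # step 4
        worker, critic, "Write plan.md to implement those recommendations", context
    )
    return {"future_recommendations.md": recs, "plan.md": next_plan}


def run_loop(worker: Agent, critic: Agent, task: str, rounds: int = 3) -> dict[str, str]:
    outputs: dict[str, str] = {}
    for _ in range(rounds):
        outputs = run_round(worker, critic, task)
        task = outputs["plan.md"]  # the next round picks up the new plan
    return outputs


def echo(name: str) -> Agent:
    """Stand-in agent so the sketch runs without any model behind it."""
    return lambda prompt, context: f"[{name}] {prompt}"


print(run_loop(echo("worker"), echo("critic"), "add CSV export", rounds=2))
```

The only state carried between rounds is the plan text, which mirrors the "clear context, repeat" step; the directives get re-injected at the start of every round so they survive the reset.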

▲

vessenes 2 hours ago | parent | prev | next [-]

I have a very similar workflow, and experience similar temperaments from the agents. I also find anecdotally that they are moderately competitive - you get very different attention from them when you say "competitor X wrote this - please find all bugs" than when you say "you just wrote this - please find all bugs".

	▲	bottlepalm 2 hours ago | parent [-]
		Hah yea I just told them I wrote it, or I reviewed it. I don't want to get the AIs in a pissing contest with each other because they will get distracted and try to show off.

▲

DonHopkins 38 minutes ago | parent | prev | next [-]

Low frequency defensive long drawn out back and forth bullet dodging vibe coding should be called "serpentine coding".

The In-Laws (1979): Getting off the plane in Tijuara:

https://www.youtube.com/watch?v=A2_w-QCWpS0

▲

i_love_retros an hour ago | parent | prev | next [-]

This all sounds insane. If it requires so much back and forth with the AI why on earth wouldn't you just write the code yourself? At least then you build the mental model of the code and keep your brain healthy. Reading the comments in here about all the hoops people are having to jump through just to do the same thing they were doing a year ago without AI... and spending a fortune to do it! I think you've all got AI psychosis.

	▲	democracy 36 minutes ago | parent [-]
		You may be right, but quite often it helps keep the focus on the forest rather than getting lost in the trees - at least for me. Boilerplate steals a lot of attention and focus, and can just be mentally exhausting.

▲

skydhash 2 hours ago | parent | prev [-]

That sounds too much like three weeks of work saving you three hours of planning.

In my experience, software engineering is a matter of knowledge. Understanding it and then coming up with a solution. The latter is a flash of insight that comes mostly from experience. Then you gather more information to flesh it out, or brainstorm it with your colleagues.

What you're describing sounds more like a ritual of doing busy work than anything practical. Because tasks vary so much. A feature may be huge, but you take care of it in a day with copy pasting because you already have all the building blocks in other files. And something may be twenty lines of code, but you spent the whole week sweating on it (concurrency stuff maybe). Those ritualistic workflows sound more like someone imagining software development than actually doing it.