I use 3 AI's (Claude, GPT and Gemini) to review each other's design plans and implementation on the same code base. Each often catches problems the others miss.

I try to make sure the architecture docs of the code base are refreshed regularly based on recent changes, so it's easier for humans and AI agents to make sense of the code.

I also regularly stop all other developments and just focus on auditing the code base with these AI's to make sure they are secure, robust, clean, and well structured and well tested -- some refactoring would be needed most of the time, and it's well worth it.

With this approach, nowadays I often merge code from AI without completely understanding what it's doing, but seems the code has been working so far. :)

▲

BobbyTables2 2 hours ago | parent | next [-]

You’ve transitioned from “individual contributor” to “manager”! (;->

	▲	wwind123 2 hours ago \| parent [-]
		Haha, true! I do sometimes have to steer the discussions between the AI's to the right direction, if they deviate too far away from the real problem, either because they miss some context, or because my original description of the problem was misleading. To do that formally, I have a mechanism built-in the review loop where if a comment on a github issue or PR is signed as "-- Human Reviewer", then all AI agents have to treat the comment as the highest priority item to address.

▲

kajman an hour ago | parent | prev | next [-]

I'm always curious when I see these stories. How long have you been doing this, for what sort of work, and was the codebase mature before you began working like this?

	▲	wwind123 an hour ago \| parent [-]
		Yeah, this one is easy: I have been doing this for half a year. I have a couple of projects worked out this way, all green-field projects, code base grew from 0 to tens of thousand of lines each.

▲

jimbobimbo 2 hours ago | parent | prev [-]

This is the way. I use gh copilot and have opus interrogate me and write the plan, then gpt review the plan and provide feedback; repeat this multiple times until gpt is either satisfied or starts to nitpick on unimportant stuff. Then sanity check the plan myself and have gpt implement it.

Each implementation is also reviewed by me before merging to master. I complete PRs only when I'm satisfied with the implementation, my feedback is addressed, and I fully understand what is going on. Agents are the replacement for typing and productivity multipliers.

I have big picture view of the product, each plan implements only a part of it, scoped to avoid merging unreviwed slop. Probably slower, but result is much better.

	▲	wwind123 an hour ago \| parent [-]
		Cool. Yeah it's important to have a big picture of the product, to steer the AI's towards the right direction in their work.