rootnod3 7 hours ago

Sorry, so the tool is now even circumventing human review? Is that the goal?

So the agent can now merge shit by itself?

Just let the damn thing push into prod by itself at this point.

blutoot 2 hours ago | parent | next [-]

At scale, I don't see a net negative in AI merging "shit by itself" if the developer (or the agent) ensures sufficient e2e, integration, and unit test coverage prior to every merge, and in return I get my team to crank out features at 10x speed.

The reality is that probably 99.9999% of code bases on this earth (but this might drop soon, who knows) pre-date LLMs, and organizing them so that coding agents can produce consistent results from sprint to sprint will need a lot of plumbing work from all dev teams. That will include refactoring, documentation improvements, building consensus on architectures, and of course reshaping the testing landscape. So SWEs will have a lot of dirty work to do before we reach the aforementioned "scale".

However, a lot of platforms are being built from the ground up today in a post-CC (Claude Code) era. And they should be ready to hit that scale today.

dsifry 2 hours ago | parent | next [-]

Yup! Software engineers aren't going to be out of work anytime soon, but I'm acting more like a CTO or VPE with a team of agents now, rather than just a single dev with a smart intern.

tadfisher 44 minutes ago | parent | prev [-]

I hate this paradigm because it pits me against my tools as if we're adversaries. The tools are prone to rewriting or even deleting the tests, so we have to write other tools to sandbox agents from each other and check each other's work, and I just don't see a way to get deterministically good results over just building shit myself. It comes down to needing high trust in my tools to feel confident in what we're shipping.

blutoot 22 minutes ago | parent [-]

The key is that, at the end of the day, productivity is king, which is a polite term for cutting head count and/or delivering at a ridiculously higher velocity.

You can always get deterministically good results at your own pace. But most likely you won't achieve that at the speed and scale of a coding agent running in 4-5 worktrees, 24/7, without food or toilet breaks, especially if the latter mostly helps achieve the product/business goals at an "OK" quality (in which case you will perhaps be measured by how well you can steer these agents to elevate that quality from "OK" without sacrificing scale too much).

ljm 6 hours ago | parent | prev | next [-]

Someone’s gonna think about wiring all this up to Linear or Jira, and there’ll be a whole new set of vulnerabilities created from malicious bug reports.

dsifry 2 hours ago | parent [-]

That's why I intentionally don't have this hooked into an ingest flow - you still get control over what issues/stories you want the agent swarm to work on... Just now, I can know that the code that was written has been reviewed and all comments have been fully addressed!

dsifry 2 hours ago | parent | prev | next [-]

No, it just prepares the PR - it doesn't automatically merge. That would be very dangerous, imho!

literalAardvark 6 hours ago | parent | prev | next [-]

In some workflows it's helpful for the full loop to be automated so that the agent can test if what's done works.

And you can do a more exhaustive test later, after the agents are done running amok to merge various things.

dsifry 2 hours ago | parent [-]

Exactly right!

baxtr 6 hours ago | parent | prev | next [-]

I’m not saying this is, but if I were a malicious state actor, that’s exactly the kind of thing I’d like to see in widespread use.

danenania 6 hours ago | parent | prev | next [-]

I don’t think “ready to merge” necessarily means the agent actually merges. Just that it’s gone as far as it can automatically. It’s up to you whether to review at that point or merge, depending on the project and the stakes.

If there are CI failures or obvious issues that another AI can identify, why not have the agent keep going until those are resolved? This tool just makes that process more token efficient. Seems pretty useful to me.
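A rough sketch of that "keep going until it's green" loop, to make the idea concrete. Everything here is hypothetical: `run_checks` and `attempt_fix` are stand-ins for whatever your CI and agent actually expose, not this tool's API. Note that it stops at "ready for review" and never merges.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    ready_for_review: bool          # True only if no check is failing
    iterations: int                 # how many fix attempts were made
    remaining_failures: List[str] = field(default_factory=list)


def iterate_until_green(
    run_checks: Callable[[], List[str]],        # hypothetical: returns names of failing checks
    attempt_fix: Callable[[List[str]], None],   # hypothetical: asks the agent to fix them
    max_iterations: int = 5,
) -> LoopResult:
    """Re-run checks and let the agent retry until nothing fails
    or the iteration budget runs out. Merging is left to a human."""
    failures = run_checks()
    iterations = 0
    while failures and iterations < max_iterations:
        attempt_fix(failures)
        iterations += 1
        failures = run_checks()
    return LoopResult(
        ready_for_review=not failures,
        iterations=iterations,
        remaining_failures=failures,
    )
```

The iteration cap is the important part: if the agent can't converge, the remaining failures go back to a person instead of burning tokens forever.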

dsifry 2 hours ago | parent [-]

That's EXACTLY right. Ready to merge is an important gate, but it is very stupid to just merge everything without further checks/testing by a human!

tayo42 3 hours ago | parent | prev | next [-]

No. The linked page explains how this fits into a development workflow, e.g.:

> A reviewer wrote “consider using X”… is that blocking or just a thought?

> AMBIGUOUS - Needs human judgment (suggestions, questions)

dsifry 2 hours ago | parent [-]

Right! It doesn't assume that all comments are actionable, or need to be worked on. However, if you allow anyone to comment on your PRs, it could be a malicious vector. So don't let anyone review PRs on projects that you care about!!!
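To illustrate the idea only: a minimal sketch that drops comments from anyone outside an allowlist before the agent sees them, then separates suggestions from clear requests. Only the AMBIGUOUS label comes from the linked page as quoted above; the other labels, the names, and the crude keyword heuristic are assumptions for illustration, not how the tool actually works.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Triage(Enum):
    ACTIONABLE = "actionable"   # hypothetical label: clear change request
    AMBIGUOUS = "ambiguous"     # needs human judgment (suggestions, questions)
    UNTRUSTED = "untrusted"     # hypothetical label: never handed to the agent


@dataclass(frozen=True)
class ReviewComment:
    author: str
    body: str


# Crude, assumed heuristic: hedging words or a question mark mean "ask a human".
HEDGE_MARKERS = ("consider", "maybe", "what if", "?")


def triage(comment: ReviewComment, trusted_authors: FrozenSet[str]) -> Triage:
    """Classify a review comment; untrusted authors are filtered out first."""
    if comment.author not in trusted_authors:
        return Triage.UNTRUSTED
    text = comment.body.lower()
    if any(marker in text for marker in HEDGE_MARKERS):
        return Triage.AMBIGUOUS
    return Triage.ACTIONABLE
```

The trust check runs before any text inspection on purpose, so a comment from an unknown account is never interpreted as instructions.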

glemion43 an hour ago | parent | prev [-]

Man, if you are so frustrated by AI, just stop reading articles about it if you don't even take the time to read them properly.

And yes, there are plenty of use cases where AI code doesn't hurt anyone even if it gets merged automatically...

See it as an interesting new field of R&D...