SwellJoe 7 hours ago

I'm ambivalent about that. I've seen people use beads, and they're just making busy work for the agents, splitting stuff up into tiny tasks that could have been one-shotted as part of the larger plan. They seem to just enjoy making thinky machine go brrr, even when it makes the work take longer and burn a lot more tokens.

I tend to think developing with agents should look a lot like managing a human (like, I use feature-branch development with PRs and review them, even on my own projects that have no other devs and don't need a paper trail for security audit purposes), so I theoretically can get down with an issue-based process, but thus far I haven't seen it done in a way that isn't just making busy work for agents.

giancarlostoro 5 hours ago | parent [-]

I started with Beads, then wound up building my own:

https://github.com/Giancarlos/guardrails

Key things: I added a concept called "gates", which are tied to every task. They force the agent to meet arbitrary requirements, such as: ensure the project still runs / compiles, run all tests and ensure they pass, review existing tests critically and point out if they're not comprehensive enough, and finally, get human confirmation on the task. Until the human confirms, the agent just works on another task, and so on.
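A rough sketch of how the gate idea could be modeled, for anyone trying to picture it. This is not the guardrails code: the gate names, the `Task` class, and `next_task` are all hypothetical, made up here to illustrate "a task can't close until every gate passes, and the agent moves on while waiting for a human".

```python
from dataclasses import dataclass, field

# Hypothetical gate names, paraphrasing the requirements listed in the comment.
GATES = ("builds", "tests_pass", "tests_reviewed", "human_confirmed")


@dataclass
class Task:
    title: str
    passed: set = field(default_factory=set)

    def pass_gate(self, gate: str) -> None:
        # Reject gate names that are not part of the configured set.
        if gate not in GATES:
            raise ValueError(f"unknown gate: {gate}")
        self.passed.add(gate)

    def pending_gates(self) -> list:
        # Gates still outstanding, in their configured order.
        return [g for g in GATES if g not in self.passed]

    def can_close(self) -> bool:
        return not self.pending_gates()

    def awaiting_human(self) -> bool:
        # Only the human sign-off is left; the agent cannot do more here.
        return self.pending_gates() == ["human_confirmed"]


def next_task(tasks):
    """Return the first task the agent can still make progress on, if any."""
    for task in tasks:
        if not task.can_close() and not task.awaiting_human():
            return task
    return None
```

With this shape, a task that has cleared everything except `human_confirmed` is skipped by `next_task`, so the agent picks up other work instead of blocking.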

I didn't like that Beads was built on top of Git. I don't always work on Git-friendly projects, and Beads kept getting messed up if I switched branches, so I made mine SQLite-based. I also made it so you can sync to GitHub issues, and sync pre-existing (and new) GitHub issues as guardrails tasks to be worked on. The agent will even leave a comment for you on GitHub when it grabs an issue, to let others know the work is potentially being done.
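To make the SQLite point concrete, here is a minimal sketch of a task store kept in a single database file, which can live outside the working tree so that switching branches never touches it. The table layout, column names, and helper functions are assumptions for illustration only, not the actual guardrails schema, and the GitHub sync itself is left out.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the task database. The schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tasks (
               id INTEGER PRIMARY KEY,
               title TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'open',
               github_issue INTEGER  -- issue number if synced, else NULL
           )"""
    )
    return conn


def add_task(conn, title, github_issue=None):
    """Insert a task and return its row id."""
    cur = conn.execute(
        "INSERT INTO tasks (title, github_issue) VALUES (?, ?)",
        (title, github_issue),
    )
    conn.commit()
    return cur.lastrowid


def claim_next(conn):
    """Mark the oldest open task as in progress and return (id, title), or None."""
    row = conn.execute(
        "SELECT id, title FROM tasks WHERE status = 'open' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE tasks SET status = 'in_progress' WHERE id = ?", (row[0],))
    conn.commit()
    return row
```

Pointing `open_store` at a file path instead of `":memory:"` would persist tasks across sessions; the point where a task is claimed is also where a sync layer could post the "I'm picking this up" comment.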

waterproof 3 hours ago | parent [-]

Nice concept! Beads did not age all that well, and Claude doesn't really want to use it since the TodoList upgrade.

Do you have any tricks for getting Claude to use guardrails effectively alongside (or instead of) TodoList?

giancarlostoro an hour ago | parent [-]

It works hand in hand, to be honest, because Claude will read tickets that match the criteria of what I'm looking to work on and tack them onto its todo list. It just becomes an overview of my tasks.