| ▲ | joshribakoff 12 hours ago |
| This is just sub agents, built into Claude. You don’t need 300,000-line tmux abstractions written in Go. You just tell Claude to do work in parallel with background sub agents. It helps to have a file for handing off the prompt, tracking progress, and reporting back. I also recommend constraining agents to their own worktrees. I am writing down the pattern here: https://workforest.space. While nearly everyone is building orchestrators, I also noticed Claude is already the best orchestrator for Claude. |
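A minimal sketch of the "own worktree plus handoff file" setup described above. The directory layout (`../worktrees/<task>`), the branch prefix `agent/`, and the file name `HANDOFF.md` are assumptions made for illustration, not something the comment or the linked site specifies; only `git worktree add -b` is a real Git command here.

```python
import subprocess
from pathlib import Path


def worktree_command(task_slug: str, base: str = "main") -> list:
    """Build the `git worktree add` command that gives one agent its own checkout."""
    # Hypothetical layout: ../worktrees/<task> on a new branch named agent/<task>.
    path = f"../worktrees/{task_slug}"
    return ["git", "worktree", "add", "-b", f"agent/{task_slug}", path, base]


def handoff_text(task_slug: str, prompt: str) -> str:
    """Render the handoff file: the prompt plus sections the agent fills in."""
    return (
        f"# Task: {task_slug}\n\n"
        f"## Prompt\n{prompt}\n\n"
        "## Progress\n- [ ] not started\n\n"
        "## Report\n(agent writes its summary here)\n"
    )


def prepare(repo: Path, task_slug: str, prompt: str) -> Path:
    """Create the worktree, then drop the handoff file inside it."""
    subprocess.run(worktree_command(task_slug), cwd=repo, check=True)
    handoff = (repo / ".." / "worktrees" / task_slug / "HANDOFF.md").resolve()
    handoff.write_text(handoff_text(task_slug, prompt), encoding="utf-8")
    return handoff
```

The main agent can then be told to start one background sub agent per worktree and to read and update only that worktree's handoff file.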
|
| ▲ | skippyboxedhero 9 hours ago | parent | next [-] |
| It isn't sub agents. The gap with existing tooling is that the abstraction is over a task rather than a conversation (due to the issue with third-party apps, Claude Code has been inherently limited to conversations, which is why they have been lacking in this area; Claude Code Web was the first move in this direction), and the AI is actually coordinating the work (as opposed to being constantly prompted by the user). One of the issues that people had which necessitated this feature is that you have a task, you tell Claude to work on it, and Claude has to keep checking back in for various (usually trivial) things. This workflow allows for more effective independent work without context management issues (if you have subagents, there is also an issue with how the progress of the task is communicated; by introducing things like a task board, it is possible to manage this state outside of context). The flow is quite complex and requires a lot of additional context that isn't required with a chat-based flow, but it is a much better way to do things. The way to think about this pattern - one which many people began concurrently building in the past few months - is an AI which manages other AIs. |
| |
| ▲ | vidarh 7 hours ago | parent | next [-] | | It isn't "just" sub agents, but you can achieve most of this just with a few agents that take on generic roles, and a skill or command that just tells claude to orchestrate those agents, and a CLAUDE.md that tells it how to maintain plans and task lists, and how to allow the agents to communicate their progress. It isn't all that hard to bootstrap. It is, however, something most people don't think about and shouldn't need to have to learn how to cobble together themselves, and I'm sure there will be advantages to getting more sophisticated implementations. | | |
| ▲ | skippyboxedhero 6 hours ago | parent [-] | | Right, but the model there is still that you tell the AI what to do; this is the AI telling other AIs what to do. The context makes a huge difference because it has to be able to run autonomously. It is possible to do this with the SDK, and the workflow is completely different. It is very difficult to manage task lists in context. Have you actually tried to do this? i.e. not within a Claude Code chat instance but by one-shot prompting. It is possible that they have worked out some way to do this, but when you have tens of tasks, merge conflicts, you are running that prompt over months, etc. At best, it doesn't work. At worst, you are burning a lot of tokens for nothing. It is hard to bootstrap because this isn't how Claude Code works. If you are just using OpenRouter, it is also not easy because, after setting up tools/rebuilding Claude Code, it is very challenging to set up an environment so the AI can work effectively, errors can be returned, questions returned, etc. Afaik, this is basically what Aider does...it is not easy, it is especially not easy in Claude Code which has a lot of binding choices from the business strategy that Anthropic picked. | | |
| ▲ | vidarh 4 hours ago | parent | next [-] | | > Have you actually tried to do this? i.e. not within a Claude Code chat instance but by one-shot prompting. You ask if I've tried to do this, and then set constraints that are completely different to what I described. I have done what I described. Several times for different projects. I have a setup like that running right now in a different window. > It is hard to bootstrap because this isn't how Claude Code works. It is how Claude Code works when you give it a number of sub-agents with rules for how to manage files that effectively work like task queues, or skills/mcp servers to interact with communications tools. > it is not easy It is not easy to do in a generic way that works without tweaks for every project and every user. It is reasonably easy to do for specific teams where you can adjust it to the desired workflows. | |
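One way "files that effectively work like task queues" could look, as a rough sketch only. The JSON layout, the status values, and every function name below are assumptions made for illustration; nothing here is taken from the commenter's setup or from Claude Code itself. There is no locking, so it assumes one orchestrator updates the file on behalf of the agents.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def load(path: Path) -> list:
    """Read the task list, or return an empty list if the file does not exist yet."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save(path: Path, tasks: list) -> None:
    """Write via a temp file and rename so readers never see a half-written file."""
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(tasks, handle, indent=2)
    os.replace(tmp, path)


def add_task(path: Path, title: str) -> int:
    """Append a new task in the 'todo' state and return its id."""
    tasks = load(path)
    task = {"id": len(tasks) + 1, "title": title, "status": "todo", "owner": None, "notes": ""}
    tasks.append(task)
    save(path, tasks)
    return task["id"]


def claim_next(path: Path, agent: str) -> Optional[dict]:
    """Hand the first unclaimed task to `agent`, or return None if nothing is left."""
    tasks = load(path)
    for task in tasks:
        if task["status"] == "todo":
            task["status"] = "in_progress"
            task["owner"] = agent
            save(path, tasks)
            return task
    return None


def complete(path: Path, task_id: int, notes: str) -> None:
    """Mark a task done and record the agent's report."""
    tasks = load(path)
    for task in tasks:
        if task["id"] == task_id:
            task["status"] = "done"
            task["notes"] = notes
    save(path, tasks)
```

Because the state lives in a file rather than in the conversation, a fresh agent can pick up where the last one stopped without needing the earlier context.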
| ▲ | ukuina 5 hours ago | parent | prev [-] | | It's natural to assume that subagents will scale to the next level of abstraction; as you mentioned, they do not. The unlock here is tmux-based session management for the teammates, with two-way communication using agent inbox. It works very well. |
|
| |
| ▲ | adastra22 7 hours ago | parent | prev [-] | | > Claude Code has been inherently limited to conversations How so? I’ve been using “claude -p” for a while now. But even within an interactive session, an agent call out is non-interactive. It operates entirely autonomously, and then reports back the end result to the top level agent. | | |
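For readers who have not used it, a small sketch of driving `claude -p` (the non-interactive print mode mentioned above) from a script. It assumes the `claude` CLI is installed and already authenticated; the wrapper function names are made up for illustration.

```python
import subprocess


def headless_command(prompt: str) -> list:
    """Build a non-interactive Claude Code invocation; `-p` runs one prompt and exits."""
    return ["claude", "-p", prompt]


def run_headless(prompt: str, cwd: str = ".") -> str:
    """Run one prompt to completion in `cwd` and return whatever the CLI printed."""
    result = subprocess.run(
        headless_command(prompt),
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    return result.stdout
```

Calling `run_headless("Summarise the open TODOs in this repo", cwd="path/to/repo")` would block until the run finishes, which matches the "operates autonomously, then reports back" behaviour described in the comment.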
| ▲ | skippyboxedhero 7 hours ago | parent [-] | | Because of OAuth. If they gave people API keys then no-one buys their ludicrously priced API product (I assume their strategy is to subsidise their consumer product with the business product). You can use Claude Code SDK but it requires a token from Claude Code. If you use this token anywhere else, your account gets shut down. Claude -p still hits Claude Code with all the tools, all the Claude Code wrapping. | | |
| ▲ | adastra22 4 hours ago | parent | next [-] | | That’s not what this subthread is about. They’re talking about the subagent within Claude Code itself. Btw, you can use the Claude Agent SDK (the renamed Claude Code SDK) with a subscription. I can tell you it works out of the box, and AFAIK it is not a ToS violation. | |
| ▲ | tobyjsullivan 6 hours ago | parent | prev | next [-] | | I believe they’re talking about Claude Code’s built-in agents feature which works fine with a Max subscription. https://code.claude.com/docs/en/sub-agents Are you talking about the same thing or something else like having Claude start new shell sessions? | |
| ▲ | TeMPOraL 5 hours ago | parent | prev | next [-] | | > If they gave people API keys then no-one buys their ludicrously priced API product The main driver for those subscriptions is that their monthly cost with Opus 3.7 and up pays for itself in a couple of hours of basic CC use, relative to API prices. | |
| ▲ | blibble 5 hours ago | parent | prev [-] | | can't you just rip the oauth client secret out of the code? |
|
|
|
|
| ▲ | stingraycharles 11 hours ago | parent | prev | next [-] |
| It’s even less of a feature, Claude Code already has subagents; this new feature just ensures Claude Code actually uses this for implementation. imho the plans of Claude Code are not detailed enough to pull this off; they’re trying to do it to preserve context, but the level of detail in the plans is not nearly enough for it to be reliable. |
| |
| ▲ | ctoth 8 hours ago | parent | next [-] | | I agree with this. Any time I make a plan I have to go back and fill it in, fill it in, what did we miss, tdd, yada yada. And yes, I have all this stuff in CLAUDE.md. You start to get a sense for what size plan (in kilobytes) corresponds to what level of effort. Verification adds effort, and there's a sort of ... Rocket equation? in that the more infrastructure you put in to handle like ... the logistics of the plan, the less you have for actual plan content, which puts a cap on the size of an actionable plan. If you can hit the sweet spot though... GTFO. I also like to iterate several times in plan mode with Claude before just handing the whole plan to Codex to melt with a superlaser. Claude is a lot more ... fun/personable to work with? Codex is a force of nature. Another important thing I will do, now that launching plans clears context, is get out of planning mode early, hit an underspecified bit, go back into planning mode and say something like "As you can see the plan was underspecified, what will the next agent actually need to succeed?" and iterate that way before we actually start making moves. This is made possible by lots of explicit instructions in CLAUDE.md for Claude to tell me what it's planning/thinking before it acts. Suppressing the toolcall reflex and getting actual thought out helps so much. | |
| ▲ | tobyjsullivan 6 hours ago | parent | prev | next [-] | | It’s moving fast. Just today I noticed Claude Code now ends plans with a reference to the entire prior conversation (as a .jsonl file on disk) with instructions to check that for more details. Not sure how well it’s working though (my agents haven’t used it yet) | |
| ▲ | dceddia 10 hours ago | parent | prev [-] | | Interesting about the level of detail. I’ve noticed that myself but I haven’t done much to address it yet. I can imagine some ideas (ask it for more detail, ask it to make a smaller plan and add detail to that) but I’m curious if you have any experience improving those plans. | | |
| ▲ | stingraycharles 9 hours ago | parent | next [-] | | I’m trying to solve this myself by implementing a whole planner workflow at https://github.com/solatis/claude-config Effectively it tries to resolve all ambiguities by making all decisions explicit — if the source cannot be resolved to documentation or anything, it’s asked to the user. It also tries to capture all “invisible knowledge” by documenting everything, so that all these decisions and business context are captured in the codebase again. Which - in theory - should make long term coding using LLMs more sane. The downside is that it takes 30min - 60min to write a plan, but it’s much less likely to make silly choices. | |
| ▲ | vardalab 7 hours ago | parent | prev | next [-] | | I iterate around issues. I have a skill to launch a new tmux window for a worktree with Claude in one pane and Codex in another pane, with instructions on which issue to work on. Claude has instructions to create a plan, while Codex has instructions to understand the background information necessary for this issue to be worked on. By the time they're both done, I can feed Claude's plan into Codex, and Codex is ready to analyze it. And then Codex feeds the plan back to Claude, and they kind of ping pong like that a couple times. And after several iterations, there's enough refinement that things usually work.
Then Claude clears context and executes the plan. Then Codex reviews the commit and it still has all the original context so it knows what we have been planning and what the research was about the infrastructure. And it does a really good job reviewing. And again, then they ping pong back and forth a couple times, and the end product is pretty decent.
Codex's strength is that it really goes in-depth. I usually do this at a high reasoning effort. But Codex has zero EQ or communication skills, so it works really well as a pedantic reviewer. Claude is much more pleasant to interact with. There's just no comparison. That's why I like planning with Claude much more because we can iterate.
I am just a hobbyist though. I do this to run my Ansible/Terraform infrastructure for a good size homelab with 10 hosts. So we actually touch real hardware a lot and there's always some gotchas to deal with. But together, this is a pretty fun way to work. I like automating stuff, so it really scratches that itch. | |
| ▲ | colelyman 8 hours ago | parent | prev [-] | | I have had good success with the plans generated by https://github.com/obra/superpowers I also really like the Socratic method it uses to create the plans. |
|
|
|
| ▲ | AffableSpatula 12 hours ago | parent | prev | next [-] |
| Claude already had subagents. This is a new mode for the main agent to be in (bespoke context oriented to delegation), combined with a team-oriented task system and a mailbox system for subagents to communicate with each other. All integrated into the harness in a way that plugins can't achieve. |
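To make the "mailbox system" idea concrete, here is a toy file-based inbox. This is not how Claude Code implements it (the comment gives no implementation details); the one-directory-per-agent layout, the message fields, and the function names are all assumptions for illustration.

```python
import json
import time
import uuid
from pathlib import Path


def send(root: Path, sender: str, recipient: str, body: str) -> Path:
    """Drop one message file into the recipient's inbox directory."""
    inbox = root / recipient
    inbox.mkdir(parents=True, exist_ok=True)
    # Timestamp prefix keeps files roughly in arrival order; uuid avoids name clashes.
    path = inbox / f"{time.time_ns()}-{uuid.uuid4().hex}.json"
    path.write_text(json.dumps({"from": sender, "body": body}), encoding="utf-8")
    return path


def receive(root: Path, recipient: str) -> list:
    """Return all pending messages for `recipient` and remove them from the inbox."""
    inbox = root / recipient
    if not inbox.is_dir():
        return []
    messages = []
    for path in sorted(inbox.glob("*.json")):
        messages.append(json.loads(path.read_text(encoding="utf-8")))
        path.unlink()  # consume the message so it is delivered only once
    return messages
```

The point of a mailbox over plain sub agent calls is that any agent can write to any other agent's inbox, rather than only returning a final result to its parent.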
| |
| ▲ | theturtletalks 9 hours ago | parent [-] | | Wow there goes a lot of harnesses out the window. The main limitation of subagents was they couldn’t communicate back and forth with the main agent. How do we invoke swarm mode in Claude Code? |
|
|
| ▲ | apsurd 10 hours ago | parent | prev | next [-] |
| OT: Your visual on "stacked PRs" instantly made me understand what a stacked PR is. Thank you! I had read about them before but for whatever reason it never clicked. Turns out I already work like this, but I use commits as "PRs in the stack" and I constantly try to keep them up to date and ordered by rebasing, which is a pain. Given my new insight with the way you displayed it, I had a chat with ChatGPT and feel good about giving it a try:
1. 2-3 branches based on a main feature branch
2. can rebase base branch with same frequency, just don't overdo it, conflicts should be base-isolated.
3. You're doing it wrong if conflicts cascade deeply and often
4. Yes merge order matters, but tools can help and generally the isolation is the important piece
|
| |
| ▲ | abhinavg 6 hours ago | parent | next [-] | | If you’re interested in exploring tooling around stacked PRs, I wrote git-spice (https://abhinav.github.io/git-spice/) a while ago. It’s free and open-source, no strings attached. | |
| ▲ | Griffinsauce 9 hours ago | parent | prev | next [-] | | If you're rebasing a lot, definitely set up rerere (reuse recorded resolution) - it improves things enormously. Do make sure you know how to reset the cache in case you did a bad conflict resolution, because it will keep biting you. Besides that caveat it's a must. | |
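A short sketch of the rerere setup and the "reset the cache" escape hatch mentioned above. `git config rerere.enabled true` and `git rerere forget <pathspec>` are real Git commands; the helper names are made up, and `wipe_cache` assumes a normal (non-bare, non-linked-worktree) repository where recorded resolutions live under `.git/rr-cache`.

```python
import shutil
import subprocess
from pathlib import Path

# Turn rerere on for one repository (add "--global" after "config" for all repos).
ENABLE_RERERE = ["git", "config", "rerere.enabled", "true"]


def forget_command(pathspec: str) -> list:
    """Command that drops the recorded resolution for the current conflict in `pathspec`."""
    return ["git", "rerere", "forget", pathspec]


def enable(repo: Path) -> None:
    """Enable rerere in the repository at `repo`."""
    subprocess.run(ENABLE_RERERE, cwd=repo, check=True)


def wipe_cache(repo: Path) -> bool:
    """Delete every recorded resolution; returns False if there was nothing to delete."""
    cache = repo / ".git" / "rr-cache"
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False
```

`git rerere forget` is the targeted fix for a single bad resolution while the conflict is active; wiping the directory is the blunt option when you no longer trust anything that was recorded.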
| ▲ | byproxy 9 hours ago | parent | prev [-] | | Isn’t this just “Gitflow”? https://www.atlassian.com/git/tutorials/comparing-workflows/... | | |
| ▲ | apsurd 9 hours ago | parent | next [-] | | After a quick read it seems like gitflow is intended to model a release cycle. It uses branches to coordinate and log releases. Stacking is meant to make development of non-trivial features more manageable and more likely to enter main safely and quickly. It's specific to each developer's workflow and wouldn't necessarily produce artifacts once merged into main (as gitflow seems to intentionally have a stance on) | |
| ▲ | withinboredom 9 hours ago | parent | prev [-] | | Please don’t use git-flow. Every time I see it, it looks like an over-engineer’s wet dream. | | |
| ▲ | seff 9 hours ago | parent [-] | | Can you say more as to why? The concept is not complex and in our situation at least provides a lot of benefits. | | |
| ▲ | jdxcode 8 hours ago | parent | next [-] | | I think the guy that created it has even stated he thinks it's a bad idea | |
| ▲ | withinboredom 8 hours ago | parent | prev [-] | | Literally the reason for git’s existence is to make merging diverging histories less complicated. Adding back the complexity misses the point entirely. |
|
|
|
|
|
| ▲ | mkw5053 10 hours ago | parent | prev | next [-] |
| Yeah, since they introduced (possibly async) subagents, I've had my main claude instance act as a manager overseeing implementation agents, keeping its context clean, and ensuring everything goes to plan in the highest quality way. |
| |
| ▲ | AffableSpatula 10 hours ago | parent [-] | | yep this is exactly how I use the main agent too, I explicitly instruct to only ever use background async subagents. Not enough people understand that the claude code harness is event driven now and will wake up whenever these subagent completion events happen. |
|
|
| ▲ | bradgessler 9 hours ago | parent | prev [-] |
| Any recommendations on sandboxing agents? Last time I asked folks recommended docker. |
| |