gck1 2 hours ago

No. I run a similar setup, and with a $200 subscription I usually hit the weekly quota by around day 3-4. My approach is 4-5 hours of intensive human-in-the-loop spec sessions with Opus and Codex:

1. We discuss every question with Opus, and where even I'm not sure what the right approach is, we ask Codex for a second opinion (via a skill that teaches Claude how to call Codex).
2. When the context window reaches ~120k tokens, I ask Opus to update the relevant spec files.
3. Repeat until all three of us (me, Opus and Codex) are happy or we start discussing nitpicks and YAGNIs, whichever comes first.

Then it's fully autonomous until all agents are happy.

Which is why I'm exploring optimization strategies. Based on an analysis of where most of the tokens go in my workflow, roughly 40% are thinking tokens along the lines of "hmm, not sure, maybe...", and 30% are code files.

So two approaches:
1. A cheap supervisor agent that detects when Claude is unsure about something (which signals a spec gap) and alerts me so that I can step in.
2. An "Oracle" agent that keeps the relevant parts of the codebase in context and can answer questions from builder agents.
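A minimal sketch of what the first idea could look like, assuming the supervisor can read the builder's thinking text as a plain string. The phrase list, the `hedge_score` and `should_alert` names, and the threshold are all hypothetical; a real supervisor would more likely be a cheap model than a regex.

```python
import re

# Hypothetical hedging phrases that may indicate a spec gap.
HEDGE_PATTERNS = [
    r"\bnot sure\b",
    r"\bmaybe\b",
    r"\bunclear\b",
    r"\bI(?:'m| am) guessing\b",
]
HEDGE_RE = re.compile("|".join(HEDGE_PATTERNS), re.IGNORECASE)


def hedge_score(thinking: str) -> float:
    """Return the number of hedging matches per 100 words of thinking text."""
    words = thinking.split()
    if not words:
        return 0.0
    return 100.0 * len(HEDGE_RE.findall(thinking)) / len(words)


def should_alert(thinking: str, threshold: float = 5.0) -> bool:
    """True when the hedging density reaches the (assumed) alert threshold."""
    return hedge_score(thinking) >= threshold
```

Something like `should_alert(chunk)` could run on each chunk of thinking output, with a hit pausing the builder and pinging the human.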

And also delegating some work to cheaper models like GLM where top performance isn't necessary.

You'll notice that as soon as you reach a setup you like that actually works, the $200 subscription quota becomes the limiting factor.

hinkley 2 hours ago

That does seem to argue for the checkpointing strategy of having the agent explain its plan and then work on it incrementally. When you run out of tokens, you either switch projects or proceed by hand until your quota recovers.

I also kinda expect that one of the saner parts of agentic development is the skills system, that skills can be completely deterministic, and that after the Trough of Disillusionment people will be using skills a lot more and AI a lot less.

gck1 2 hours ago

Yes on both counts. The implementation plan is a second layer, written after the spec is finished, at which point agents can no longer change the spec. I then launch a planner agent that writes a phased plan file, and each builder can only work on a single phase from that file.

So it's spec (human in the loop) > plan > build. Then it cycles autonomously in plan > build until spec goals are achieved. This orchestration is all managed by a simple shell script.
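For illustration, the autonomous plan > build cycle could be sketched roughly like this, in Python rather than the shell script mentioned above. The `plan`, `build` and `goals_met` callables and the `max_rounds` cap are hypothetical stand-ins for launching the planner, launching one builder per phase, and checking the spec goals.

```python
from typing import Callable, List


def run_cycle(
    plan: Callable[[], List[str]],
    build: Callable[[str], None],
    goals_met: Callable[[], bool],
    max_rounds: int = 10,
) -> int:
    """Repeat plan > build until goals_met() is true; return the rounds used."""
    for round_no in range(1, max_rounds + 1):
        phases = plan()          # planner produces a phased plan
        for phase in phases:
            build(phase)         # each builder gets exactly one phase
        if goals_met():          # spec goals are checked after every round
            return round_no
    raise RuntimeError("spec goals not met within max_rounds")
```

The round cap is an assumption added here so that a bad plan cannot burn quota indefinitely.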

But even with the implementation plan file, a new agent has to orient itself and load files it may later decide were irrelevant. The plan may not have been completely correct, there may have been gaps, initial assumptions may not hold, etc. That's where it starts eating tokens.

And it feels like this can be optimized further.

And yes on deterministic tooling as well.