vardalab 4 hours ago

Have a frontier model do the planning, which is the most time-consuming part anyway, and then have a local LLM do the implementation. The frontier model can orchestrate your tickets, write a plan for each, and dispatch local LLM agents to implement them at about 180 tokens/s; vLLM can probably manage something like 25 concurrent sessions on an RTX 6000. Do it all in git worktrees, then have the frontier model do the review and merge.

I am just a retired hobbyist, but that's my approach. I run everything through Gitea issues: the orchestrator launches each issue in a new tmux window, and the two main agents (implementer and reviewer) get their own panes so I can see what's going on. I think Claude Code now has this somewhat streamlined as well, but I have seen no need to change my approach yet, since I am only tinkering on my personal projects. Also, right now I just use Claude Code subagents, but I have been thinking of replacing them with some of these Qwen 3.5 models, because they do seem capable and I have the hardware to run them.
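
For anyone curious what the per-issue launch looks like, here is a rough sketch in Python, not my exact setup. It only builds the git and tmux commands for one issue: a new worktree on its own branch, then a tmux window split into two panes. The branch and window naming (`issue-<n>`), the worktree location next to the repo, and the `run-implementer` / `run-reviewer` commands are all made-up placeholders; substitute whatever actually starts your agents. It assumes it runs inside an existing tmux session.

```python
import shlex
from pathlib import Path


def worktree_path(repo: Path, issue: int) -> Path:
    # Hypothetical layout: a sibling directory such as /srv/repo-issue-42.
    return repo.parent / f"{repo.name}-issue-{issue}"


def worktree_cmd(repo: Path, issue: int) -> list[str]:
    # `git worktree add -b <branch> <path>` creates a new branch and
    # checks it out in its own directory, separate from the main checkout.
    return [
        "git", "-C", str(repo), "worktree", "add",
        "-b", f"issue-{issue}", str(worktree_path(repo, issue)),
    ]


def tmux_cmds(issue: int, workdir: Path, implementer: str, reviewer: str) -> list[list[str]]:
    # One detached window per issue, then a side-by-side split so the
    # implementer and reviewer each get their own pane in the worktree.
    window = f"issue-{issue}"
    return [
        ["tmux", "new-window", "-d", "-n", window, "-c", str(workdir), implementer],
        ["tmux", "split-window", "-h", "-t", window, "-c", str(workdir), reviewer],
    ]


def plan_launch(repo: Path, issue: int) -> list[list[str]]:
    # Placeholder agent commands (assumptions, not real CLIs).
    implementer = f"run-implementer --issue {issue}"
    reviewer = f"run-reviewer --issue {issue}"
    workdir = worktree_path(repo, issue)
    return [worktree_cmd(repo, issue)] + tmux_cmds(issue, workdir, implementer, reviewer)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in plan_launch(Path("/srv/repo"), 42):
        print(shlex.join(cmd))
```

To actually run it, pass each command list to `subprocess.run(cmd, check=True)` in order. Keeping the command building separate from execution makes it easy to eyeball what the orchestrator is about to do before it touches the repo.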