petra 5 hours ago

Maybe, for some projects, instead of generating code with it, it would be useful to generate a plan and the loop (tests/formal verification), because those take far fewer tokens than a full project, and then run the loop using the older models?

Congeec 4 hours ago | parent | next [-]

Yes, I've been using Opus to write a plan and fanning out Sonnet subagents to implement it. Cheaper and faster.
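
A minimal sketch of that plan-then-fan-out pattern, not any particular harness's implementation. Everything named here is hypothetical: `Model` is just "a function from prompt to text", and `fake_planner` / `fake_worker` are stubs standing in for an expensive planning model and a cheaper implementation model, so the example runs without any API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical type: any function that takes a prompt and returns text.
Model = Callable[[str], str]


def plan_and_fan_out(task: str, planner: Model, worker: Model,
                     max_workers: int = 4) -> List[str]:
    """Ask the planner for one step per line, then run each step on a worker."""
    plan = planner(f"Break this task into independent steps, one per line:\n{task}")
    # Drop blank lines so each remaining line is one unit of work.
    steps = [line.strip() for line in plan.splitlines() if line.strip()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the steps.
        return list(pool.map(worker, steps))


# Stub models (assumptions for illustration); swap in real clients.
def fake_planner(prompt: str) -> str:
    return "write parser\nwrite tests\n"


def fake_worker(step: str) -> str:
    return f"done: {step}"


results = plan_and_fan_out("build a CSV tool", fake_planner, fake_worker)
print(results)
```

The planner is called once and the worker once per step, which is where the savings would come from if the worker is the cheaper model. This only works cleanly when the steps are independent of each other.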

hirvi74 4 hours ago | parent [-]

What about quality? Being cheaper and faster, while great and all, is less valuable than quality to me.

mohamedkoubaa an hour ago | parent | next [-]

You can always have Opus review the result at the end.

dakolli 2 hours ago | parent | prev [-]

All the code an LLM produces is of questionable quality, so I'm not sure why you'd prioritize quality over speed. Speed is their only value add.

zakisaad 2 hours ago | parent | next [-]

This is a wild take. All cars can perfectly drive around a track, so why would you ever want an F1 car?

icedchai 2 hours ago | parent | prev | next [-]

Have you compared models across providers? The quality, for the same task, varies tremendously. If you don't prioritize quality you're wasting your own time when you inevitably have to re-do the code...

mohamedkoubaa an hour ago | parent | prev [-]

All code is of questionable quality.

There I fixed it

meco 4 hours ago | parent | prev | next [-]

This is the goal behind Devin Fusion, pretty good results so far I think.

https://cognition.com/blog/devin-fusion

iririririr 4 hours ago | parent [-]

so, pretty much undo the "magic" that the harness is for

xtracto 3 hours ago | parent | prev | next [-]

Has anyone experimented with Batch Processing? According to https://claude.com/pricing#api using Batch processing cuts the price 50%. So I wonder if any of the harnesses like OpenCode/Pi or similar could be made to use that for planning or similar.
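
A back-of-the-envelope sketch of what that discount would mean for a planning step. The 50% figure is the one quoted above; the token counts and per-million-token prices are hypothetical placeholders, not real list prices, and the sketch assumes the discount applies equally to input and output tokens.

```python
BATCH_DISCOUNT = 0.5  # the 50% batch discount quoted above


def planning_cost(input_tokens: int, output_tokens: int,
                  price_in_per_mtok: float, price_out_per_mtok: float,
                  batch: bool = False) -> float:
    """Cost of one request, with prices given per million tokens."""
    cost = (input_tokens * price_in_per_mtok
            + output_tokens * price_out_per_mtok) / 1_000_000
    return cost * (1 - BATCH_DISCOUNT) if batch else cost


# Hypothetical numbers for illustration only.
standard = planning_cost(200_000, 20_000, 10.0, 50.0)
batched = planning_cost(200_000, 20_000, 10.0, 50.0, batch=True)
print(standard, batched)
```

With these made-up numbers the planning call goes from 3.0 to 1.5, so the saving scales with how much of the total spend is planning rather than implementation.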

bob778 3 hours ago | parent [-]

Batch can take up to 24 hours (and often does) and may never complete if it gets cancelled, so it’d be hard to build a user workflow around it unless you kick off planning on Friday and come back Monday.

beastman82 4 hours ago | parent | prev | next [-]

this is the idea of opusplan https://code.claude.com/docs/en/model-config#opusplan-model-...

yieldcrv 4 hours ago | parent [-]

Article has a section about context window size settings

I love not getting compacted so often, but 1M context is trash right now, the degradation in speed and quality is too great above ~600k context

No different from what everyone already knows, but the 1M context is being passed off as an innovation, the same way 64k context once was relative to 8k context.

nonethewiser 4 hours ago | parent | prev | next [-]

Isn't that the kind of thing it's best at as well? At least compared with other models. The more agentic stuff: planning, tool orchestration, etc.

giancarlostoro 4 hours ago | parent | prev | next [-]

I think that's the idea. I saw some outrage on Reddit about Fable using Opus to do code writing, and another comment said exactly my reaction: why would you want to pay double for tool calling when Opus is just fine for the task?

sajithdilshan 4 hours ago | parent | prev | next [-]

But wouldn't that still result in higher token usage to scan the code base, figure out the changes, and generate the plan? In my experience Opus sometimes launches a Haiku sub-agent to explore the code base, but it's not guaranteed.

Marha01 4 hours ago | parent | prev | next [-]

Yes, I do this all the time in Cline. It supports automatic model change when switching from Plan mode to Act (implementation) mode. Opus for planning and Sonnet for implementation. It works great.
