Remix.run Logo
_pdp_ 9 hours ago

I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.

I mean, I am sure they don't mean it but they have the incentive to burn as much tokens as they are allowed to get away with. Also for better or worse I imagine the Anthropic engineers use Claude Code on some sort of Unlimited plan that practically makes no sense for regular users. So adding a 100k tokens is not a big deal.

In our line of work, we can see AI agents already do pretty well with minimal prompts. Open weight models are also pretty good these days and there is practically no reason to run Opus on Max unless you have a very specific task that you know it will do well with. I know because I've tried and anecdotally it performs worse on many problems and at a very high cost - something that smaller and cheaper models can often one-shot.

lukeschlather 8 hours ago | parent | next [-]

I don't think we've agreed to anything. That said I think paying for something like Claude Code makes a lot of sense because you can outsource the question of "how many tokens should I use per hour and how should I use them?" to the people providing the tokens.

If you want to plug your API keys into a third-party harness, that's totally cool and honestly, I'm looking into doing that right now and I haven't used any of the first-party harnesses at all. But the first time I accidentally spend $300 in a day I may be thinking about how a $20/month plan might be pretty good even if performance is inconsistent, at least I know what my costs are.

margalabargala 8 hours ago | parent | prev | next [-]

> I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.

It's because the subscriptions force you to do so. The subscriptions are the most economical way to use e.g. Claude by close to an order of magnitude. If you max out a 20x plan every week, doing the same work with the API would cost you well into the four figures.

Anyone already using the Claude API pricing and using CC over OpenCode is kneecapping themselves.

esperent 7 hours ago | parent | next [-]

I switched over to codex with pi last week. Even though I strongly dislike OpenAI and I hope this is a temporary solution, they're the only one of the frontier models that let me use my own harness and after recent CC shenanigans I'm done with proprietary harnesses.

The immediate thing I've noticed: I get way more out of the codex $100 plan than I was getting out of the Anthropic $200. Like, probably 2x at least.

The other think I've noticed: when using strict guardrails, TDD, reviews etc. I cannot notice any quality difference. Not only between Opus and Codex but even between the most recent models - GPT 5.3 code, GPT 5.4, and now GPT 5.5.

Well, 5.5 uses a huge amount of my session limits. 5.3 is very light, 5.4 somewhere in between. So now I use 5.4 for the main session/debugging/planning and then execute with 5.3.

Regarding usage, of course, it's hard to say how much is the model and how much is coming from Claude code and all this ridiculous malware scanning.

But it's nice to use a lightweight harness like pi and see that even with all my personal instructions, a good bunch of skills, custom tools etc., if I start a session and say "hi" I'm starting out with about 15k of context used. I think a closely equivalent setup in CC would start at 30-40k context.

gwerbin 6 hours ago | parent [-]

What's your Pi setup?

esperent 2 hours ago | parent [-]

Probably not that different to everyone else's plan -> tdd -> review loops.

_pdp_ 8 hours ago | parent | prev [-]

Correct. However, last time I checked enterprise customers are moving to metered billing. GitHub also decided to so. So it seems the subsidy is coming to an end? I don't know.

p0w3n3d 32 minutes ago | parent | prev | next [-]

yeah, classic conflict of interest.

However nobody is agreeing with that, that's how it's done, and move faster faster, because of goldrush! faster!@@@!

vineyardmike 9 hours ago | parent | prev | next [-]

This is why the subscriptions are important. When the usage is (vaguely) unmetered, the provider has an incentive to make usage cheap on marginal use.

It aligns the incentives for faster, cheaper, terse and more reliable models, because the model providers pay the wasted tokens and electricity costs.

jdiff 8 hours ago | parent [-]

That would seem to misalign the incentives in the opposite direction. Cut corners, reduce costs by any means necessary even to the detriment of performance. One of the most common comments I see here on the release of a new Anthropic model is that everyone better enjoy the 48 hours of access to an un-nerfed model before the cost cutting sets in.

Grimburger 8 hours ago | parent | prev | next [-]

> adding a 100k tokens is not a big deal

Did you mean 100 billion tokens because 100k isn't a big deal at all?

serf 8 hours ago | parent | prev | next [-]

>I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.

the best performing and capable ones are all the ones that aren't tied to a specific api.

charcircuit 2 hours ago | parent | prev | next [-]

It makes perfect sense to me for an AI system to be vertically owned that way you can do vertical optimization.

ikiris 8 hours ago | parent | prev | next [-]

no, they have incentive to charge as much as they want, butt they have massive costs / capacity constraints per token, if anything they have a major incentive to reduce them because they literally cannot meet demand.

varispeed 9 hours ago | parent | prev [-]

They also have incentive to nerf models occasionally, so they rarely one shot the task and more often they do it wrong and then you have to spend on tokens to correct it. Bonus points if model suddenly goes completely dumb then you have to start the session over.