syntaxing 4 hours ago

Anthropic can choose which tools are allowed to use their API or subscription, but I never fully understood what they gain from having the subscription explicitly work only with Claude Code. Is the issue that it disincentivizes use of their API?

Aurornis 3 hours ago | parent | next [-]

It’s basic market segmentation.

They gave Claude Code a discount to make it work as a product.

The API is priced for all general purpose usage.

They never sold the Claude Code endpoint as a cheaper general purpose API. The stories about “blocking OpenCode” are getting kind of out of hand because they’d block any use of the Claude Code endpoint that wasn’t coming from their Claude Code tool.

drakenot 4 hours ago | parent | prev | next [-]

Perhaps concentrated use of Claude Code increases their perceived market value.

It also perhaps tries to preserve some moat around their product/service.

conception 4 hours ago | parent [-]

And telemetry, tooling reports, usage signals from Claude Code signing PRs on GitHub, and things like that.

_boffin_ 4 hours ago | parent | prev | next [-]

Are they ZDR (zero data retention) for prompts and completions, and so possibly rely on usage statistics from their CLI to infer how people are using it?

Palmik 2 hours ago | parent [-]

Not at all. They train on your prompts and codebase unless you opt out.

paxys 4 hours ago | parent | prev | next [-]

Owning the client gives them full control over which model to use for which query, prompt caching, rate limiting and lots more. So they can drive massive savings for the ~same output over just giving unrestricted access to the API.

syntaxing 4 hours ago | parent [-]

Wouldn’t most of those savings be made on the server side anyway? I would be very surprised if Claude Code does any of that on the client side.

ankit219 4 hours ago | parent | prev [-]

The issue is that Claude Code is cheap because it uses the API's unused capacity. These kinds of circumvention hurt them in two ways: one, they don't know how to estimate API demand, and two, other harnesses are burstier (e.g. parallel calls) than Claude Code, so it screws over other legit users. Claude Code very rarely makes parallel calls for context commands etc., but these ones do.

Re the whole unused capacity point: that is the nature of inference on GPUs. In any cluster, you can batch inputs (i.e. it takes the same time for, say, 1 query or 100, as they can be parallelized), and now continuous batching[1] exists. With the API and the bursty nature of its requests, clusters would sit at 40%-50% of peak API capacity. It makes sense to divert the spare capacity to subscriptions. That reduces API costs in future and gives Anthropic a way to monetize unused capacity. But if everyone does it, then there is no unused capacity to manage and everyone loses.

[1]: https://huggingface.co/blog/continuous_batching
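To make the "40%-50% of peak" point concrete, here is a rough sketch of the arithmetic. It assumes a cluster provisioned for peak API demand. The hourly demand numbers and the `spare_capacity` helper are made up for illustration, not anything Anthropic has published.

```python
def spare_capacity(demand, capacity=None):
    """Return (average utilization, per-slot spare capacity).

    `demand` is a list of load values per time slot (hypothetical units,
    e.g. concurrent requests). If `capacity` is not given, assume the
    cluster is provisioned for the peak of the observed demand.
    """
    if capacity is None:
        capacity = max(demand)
    utilization = sum(demand) / (capacity * len(demand))
    spare = [capacity - d for d in demand]
    return utilization, spare


# Hypothetical bursty API demand over six time slots.
hourly_demand = [20, 30, 100, 60, 25, 35]
utilization, spare = spare_capacity(hourly_demand)

print(f"average utilization: {utilization:.0%}")
print(f"spare capacity per slot: {spare}")
```

In this toy trace the cluster averages 45% utilization, so more than half of what was provisioned for the peak sits idle in most slots. That idle part is what the comment suggests could be sold through subscriptions, as long as subscription traffic does not itself spike at the same time.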

blitzar 4 hours ago | parent | next [-]

Your suggested functionality is server side, not client side.

> it uses API's unused capacity

I see no waiting or scheduling on my usage - it runs at what appears to be full speed until I hit my 4 hour / 7 day limit, and then it stops.

Claude code is cheap (via a subscription) because it is burning piles of investor cash, while making a bit back on API / pay per token users.

ankit219 4 hours ago | parent [-]

Why would scheduling be a thing in this case? I might be missing something here.

With continuous batching, you don't wait for the entire previous batch to finish. A new request goes in as soon as one finishes. Hence the wait time is negligible.
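A toy simulation of the difference, as a sketch only: it assumes a fixed number of batch slots and counts time in decode steps. The `Request` type and both scheduler functions are hypothetical names for illustration, not how any real inference server is implemented.

```python
from dataclasses import dataclass


@dataclass
class Request:
    arrival: int  # step at which the request arrives
    length: int   # number of decode steps it needs (>= 1)


def static_batching(requests, batch_size):
    """A new batch starts only after the whole previous batch finishes."""
    queue = sorted(requests, key=lambda r: r.arrival)
    waits, clock = [], 0
    while queue:
        start = max(clock, queue[0].arrival)
        # Take the requests that have already arrived, up to batch_size.
        n = 0
        while n < len(queue) and n < batch_size and queue[n].arrival <= start:
            n += 1
        batch, queue = queue[:n], queue[n:]
        waits += [start - r.arrival for r in batch]
        # The batch occupies the server until its longest request is done.
        clock = start + max(r.length for r in batch)
    return waits


def continuous_batching(requests, batch_size):
    """A waiting request is admitted as soon as any slot frees up."""
    queue = sorted(requests, key=lambda r: r.arrival)
    running = []  # remaining decode steps for each occupied slot
    waits, clock = [], 0
    while queue or running:
        while queue and len(running) < batch_size and queue[0].arrival <= clock:
            r = queue.pop(0)
            waits.append(clock - r.arrival)
            running.append(r.length)
        # One decode step: finished requests release their slot.
        running = [n - 1 for n in running if n > 1]
        clock += 1
    return waits


reqs = [Request(0, 8), Request(0, 2), Request(1, 2), Request(2, 2)]
print("static waits:    ", static_batching(reqs, batch_size=2))
print("continuous waits:", continuous_batching(reqs, batch_size=2))
```

With one long request and several short ones, the short requests in the static version wait for the long one to finish, while in the continuous version they slip into the slot freed by the first short request. The wait is not zero in this toy model, just much smaller, and it still grows once all slots are full.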

ehsanu1 4 hours ago | parent | prev [-]

They have rate limits for this purpose. Many folks run claude code instances in parallel, which has roughly the same characteristics.

ankit219 3 hours ago | parent [-]

Not the same.

They have usage limits on the subscription. I don't know about rate limits. Certainly not per request.