You're making the false assumption that every token costs the same to serve. It doesn't. Yes, in the limit the cost to serve the model and generate a response is O(tokens), but generating a new token can be cheaper when the token count is small than when it is large. If other harnesses prompt with more tokens than Claude Code does, their requests can be more expensive to serve.
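To make that concrete, here is a toy sketch, not anyone's real pricing or serving cost. It assumes the marginal cost of one new token is a fixed part plus a part that grows with how many tokens are already in context. The names `marginal_token_cost` and `request_cost` and the constants `base` and `per_context` are made up for illustration, and it ignores the cost of processing the prompt itself.

```python
def marginal_token_cost(context_len: int, base: float = 1.0, per_context: float = 0.001) -> float:
    """Hypothetical cost (arbitrary units) of generating one more token
    when context_len tokens are already in context."""
    return base + per_context * context_len


def request_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Sum the marginal cost of each output token as the context grows."""
    return sum(
        marginal_token_cost(prompt_tokens + i) for i in range(output_tokens)
    )


# Same number of generated tokens, different prompt sizes (assumed values).
small_prompt = request_cost(prompt_tokens=2_000, output_tokens=500)
large_prompt = request_cost(prompt_tokens=20_000, output_tokens=500)

print(f"small prompt: {small_prompt:.2f}")
print(f"large prompt: {large_prompt:.2f}")
```

Under these assumed constants, the same 500 output tokens cost several times more behind the larger prompt, which is the sense in which a harness that sends bigger prompts could be more expensive to serve.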

stavros 6 hours ago | parent [-]

They have limits. I don't care how expensive it is to serve. I'm paying them for a given amount of tokens (a limit which THEY SET), and they also want to dictate where I spend those tokens.

	verdverm 6 hours ago | parent | next [-]
		Those are subsidized tokens because you are also using their product. They have a per-token payment option where you can use any tool you like.
	charcircuit 6 hours ago | parent | prev [-]
		> I'm paying them for a given amount of tokens

		The plans do not say how many tokens you get. People are paying for access, and higher plans get more usage. The marketing and support material for the plans uses only the word "usage" and never "tokens."