charcircuit 7 hours ago
You are making the false assumption that every token costs the same to serve. It doesn't. Yes, in the limit the cost to serve the model and generate a response is O(tokens), but generating a new token is cheaper when the context is short than when it is long. If other harnesses prompt with more tokens than Claude Code does, they can be more expensive to serve.
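To make the point concrete, here is a toy cost model. It is only a sketch: the two constants and the `decode_cost` helper are invented for illustration and are not anyone's real pricing or serving numbers. It assumes each generated token has a fixed cost plus a cost proportional to how many tokens are already in context.

```python
# Toy cost model. Both constants are made-up, illustrative assumptions.
FIXED_COST = 1.0    # per-token work that does not depend on context length
ATTN_COST = 0.001   # per-token work for each token already in context


def decode_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Relative cost of generating output_tokens after a prompt of prompt_tokens."""
    total = 0.0
    for i in range(output_tokens):
        context = prompt_tokens + i  # tokens the new token has to attend to
        total += FIXED_COST + ATTN_COST * context
    return total


if __name__ == "__main__":
    # Same number of generated tokens, different prompt sizes.
    print("short prompt:", decode_cost(2_000, 500))
    print("long prompt: ", decode_cost(20_000, 500))
```

Under these assumptions, the same 500 output tokens cost more after the longer prompt, which is the sense in which a harness that sends bigger prompts can be more expensive to serve.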
stavros 6 hours ago
They have limits. I don't care how expensive it is to serve: I'm paying them for a given number of tokens (a limit which THEY SET), and they also want to dictate where I spend those tokens.