|
| ▲ | simsla an hour ago | parent | next [-] |
| I do wonder if it's fair to expect users to absorb cache miss costs when using Claude Code, given how opaque those costs are. |
|
| ▲ | yummytummy 5 hours ago | parent | prev | next [-] |
| That might be, but the argument was that poor cache utilization was costing Anthropic too much money in other harnesses. If cache is considered in rate limits, it doesn’t matter from a cost perspective: you’ll just hit your rate limits faster in other harnesses that don’t try to optimize for caching. |
| |
| ▲ | bcherny 5 hours ago | parent [-] |
| There were two issues with some other 3p harnesses: |
| 1. Poor cache utilization. I put up a few PRs to fix these in OpenClaw, but the problem is their users update to new versions very slowly, so the vast majority of requests continued to use cache inefficiently. |
| 2. Spiky traffic. A number of these harnesses use un-jittered cron, straining services due to weird traffic shape. Same problem: it's patched, but users upgrade slowly. |
| We tried to fix these, but in the end, it's not something we can directly influence on users' behalf, and there will likely be more similar issues in the future. If people want to use these they are welcome to, but subscription clients need to be more efficient than that. |
| ▲ | SyneRyder 4 hours ago | parent | next [-] |
| How much jitter would you prefer, how many seconds / minutes out? I have some morning tasks that run while I'm asleep via claude -p, and it sounds like I'm slightly contributing to your spikes (presumably hourly and on quarter hours). |
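For readers wondering what "jittered" scheduling looks like in practice, here is a minimal sketch, assuming the scheduled job is a small Python wrapper rather than a bare cron entry. The helper names `jitter_delay` and `run_with_jitter` are hypothetical, and the thread does not state a preferred jitter window, so the window size is left as a parameter.

```python
import random
import subprocess
import time
from typing import Optional, Sequence


def jitter_delay(max_jitter_seconds: float, rng: Optional[random.Random] = None) -> float:
    """Return a uniformly random delay in [0, max_jitter_seconds]."""
    if max_jitter_seconds < 0:
        raise ValueError("max_jitter_seconds must be non-negative")
    rng = rng or random.Random()
    return rng.uniform(0.0, max_jitter_seconds)


def run_with_jitter(command: Sequence[str], max_jitter_seconds: float) -> int:
    """Sleep for a random delay, then run the command and return its exit code."""
    # Spreading start times over a window avoids every client firing
    # at exactly the top of the hour or the quarter hour.
    time.sleep(jitter_delay(max_jitter_seconds))
    return subprocess.run(list(command), check=False).returncode


# Hypothetical usage (window and prompt are placeholders, not recommendations):
# run_with_jitter(["claude", "-p", "summarize overnight changes"], 600)
```

Cron would still fire at a fixed time; the wrapper just delays the actual request by a random amount within the chosen window, so many users on the same schedule no longer hit the service at the same instant.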
| ▲ | dollspace 3 hours ago | parent | prev [-] |
| If you give doll a list of things you want to see from third-party harnesses (a compliance checklist), it will make sure the one it is building follows it to the letter. |
|
|
|
| ▲ | eastbound 5 hours ago | parent | prev | next [-] |
| I’m sorry, but when you wake up in the morning with 12% of your session used, saying “it’s the cache” is not an appropriate answer. And I’m using Claude on a small module in my project; the automations that read more to take up more context are a scam. |
|
| ▲ | beacon294 4 hours ago | parent | prev [-] |
| Politely, no. |
| - I wrote an extension in Pi to warm my cache with a heartbeat. |
| - I wrote another to block submission after the cache expired (heartbeats disabled or run out). |
| - I wrote a third to hard-limit my context window. |
| - I wrote a fourth to handle cache control placement before forking context for fan-out. |
| - My initial prompt was 1000 tokens, improving cache efficiency. |
| Anthropic is STOMPING on the diversity of use cases of their universal tool, see you when you recover. |
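The "block submission after the cache expired" idea above can be sketched roughly as follows. This is not the commenter's extension and does not use Pi's extension API; `CacheGuard` and its methods are hypothetical names, and the cache lifetime is a parameter because the thread does not state one.

```python
import time
from typing import Callable, Optional


class CacheGuard:
    """Tracks when the prompt cache was last refreshed and flags expiry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds  # assumed cache lifetime, supplied by the caller
        self._clock = clock
        self._last_refresh: Optional[float] = None

    def mark_refreshed(self) -> None:
        """Call after any request (real or heartbeat) that touches the cache."""
        self._last_refresh = self._clock()

    def is_expired(self) -> bool:
        """True only if a cache existed and its assumed lifetime has elapsed."""
        if self._last_refresh is None:
            return False  # nothing cached yet, so nothing to lose
        return (self._clock() - self._last_refresh) >= self.ttl_seconds

    def check_submit(self) -> None:
        """Raise instead of silently sending a request that would miss the cache."""
        if self.is_expired():
            raise RuntimeError("prompt cache has likely expired; confirm before resubmitting")
```

A harness would call `mark_refreshed()` after each request or heartbeat and `check_submit()` before sending a new prompt, letting the user decide whether to pay for a full cache rebuild.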