| ▲ | kang 6 hours ago | |
> tokens written to cache all at once, which would eat up a significant % of your rate limits Construction of context is not an llm pass - it shouldn't even count towards token usage. The word 'caching' itself says don't recompute me. Since the devs on HN (& the whole world) is buying what looks like nonsense to me - what am I missing? | ||