iamjackg 3 hours ago

It's not unsolved, at least not the first part of your question. In fact, it's a feature offered by all the major LLM providers:

- https://platform.openai.com/docs/guides/prompt-caching

- https://platform.claude.com/docs/en/build-with-claude/prompt...

- https://ai.google.dev/gemini-api/docs/caching
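
FWIW, OpenAI's version is automatic for long, stable prompt prefixes (no opt-in needed). A minimal sketch of checking whether a call actually hit the cache; the model name and prompt are placeholders, and the usage detail field can be absent on some models:

    from openai import OpenAI

    client = OpenAI()
    long_system_prompt = "..."  # imagine a large, stable system prompt here
    resp = client.chat.completions.create(
        model="gpt-4o",  # example of a caching-eligible model
        messages=[
            {"role": "system", "content": long_system_prompt},  # stable prefix
            {"role": "user", "content": "latest question"},     # varying suffix
        ],
    )
    # Prefixes beyond ~1024 tokens are cached automatically; hits show up
    # in the usage accounting:
    print(resp.usage.prompt_tokens_details.cached_tokens)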

igravious an hour ago

Dumb question, but is prompt caching available to Claude Code …?

imiric 3 hours ago

Ah, that's good to know, thanks.

But then why is there compounding token usage in the article's trivial solution? Is it just a matter of using the cache correctly?

StevenWaterman 3 hours ago

Cached tokens are cheaper (roughly a 90% discount) but not free.
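
Back-of-the-envelope, using Anthropic's published multipliers (cache writes around 1.25x the base input price, cache reads around 0.1x; treat the exact prices below as assumptions):

    # Rough per-call input cost for a 100k-token cached prefix.
    base_per_token = 3.00 / 1_000_000  # $3/MTok, e.g. a Sonnet-class model
    prefix_tokens = 100_000

    uncached    = prefix_tokens * base_per_token         # ~$0.30 on every call
    cache_write = prefix_tokens * base_per_token * 1.25  # ~$0.375 when the cache is (re)populated
    cache_read  = prefix_tokens * base_per_token * 0.10  # ~$0.03 on each subsequent call

So repeated calls over the same prefix get roughly 90% cheaper, but the cost still grows with context length.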

moyix 2 hours ago

Also, unlike OpenAI's, Anthropic's prompt caching is explicit (you set up to 4 cache "breakpoints"), meaning that if you don't implement it, you don't benefit from it.
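
A minimal sketch of what a breakpoint looks like with Anthropic's Python SDK (model name and prompt are placeholders; the cache_control block is the part that matters):

    import anthropic

    client = anthropic.Anthropic()
    long_system_prompt = "..."  # imagine a large, stable system prompt here
    resp = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # example model
        max_tokens=1024,
        system=[
            {
                "type": "text",
                "text": long_system_prompt,
                # Explicit breakpoint: everything up to and including this
                # block is eligible for caching on subsequent calls.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        messages=[{"role": "user", "content": "latest question"}],
    )
    # Usage reports cache writes and reads separately:
    print(resp.usage.cache_creation_input_tokens, resp.usage.cache_read_input_tokens)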

netcraft 2 hours ago

That's a very generous way of putting it. Anthropic's prompt caching is actively hostile and very difficult to implement properly.