naaqq 2 hours ago

DeepSeek’s official API reports a cache hit rate of over 99% if you use it continuously in the same codebase over long sessions, so it’s much cheaper than frontier models. For example, I have a 200M-token session in Claude Code.

halfwhey an hour ago | parent [-]

Might be a dumb question but do you have to read the files in the same order in new sessions to ensure the correct prefix for the cache?

WatchDog 30 minutes ago | parent | next [-]

Yes, you have to use the same session. I guess you could load up a bunch of context and then fork the session into a few different tasks, although I haven't tried it.
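The forking idea above can be sketched like this: build one shared context prefix, then branch it into independent task threads. All the names here (the message shapes, the `fork` helper, the task strings) are hypothetical illustrations, not any real client API; the point is only that every branch starts with byte-identical leading messages, which is exactly what a server-side prefix cache can reuse.

```python
# Hedged sketch of "forking" a session for prefix caching.
# Each branch copies the same shared prefix and appends its own task,
# so a branch's first request is mostly a prefix-cache hit.
import copy

# Shared context loaded once (hypothetical placeholder content).
shared_context = [
    {"role": "user", "content": "Here is the codebase: <files...>"},
    {"role": "assistant", "content": "Understood, I have read the files."},
]

def fork(context, task):
    """Copy the shared prefix and append a task-specific user prompt."""
    branch = copy.deepcopy(context)
    branch.append({"role": "user", "content": task})
    return branch

fix_bug = fork(shared_context, "Fix the off-by-one in parser.py")
add_docs = fork(shared_context, "Write docstrings for utils.py")

# The two branches diverge only after the shared, cacheable prefix.
assert fix_bug[:2] == add_docs[:2]
assert fix_bug[2] != add_docs[2]
```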

weiliddat 34 minutes ago | parent | prev | next [-]

Also curious. With tool calls reading and searching different files, and possibly compaction when working through a large codebase or long threads, I can't imagine how you hit a 99% cache rate.

naaqq 33 minutes ago | parent | prev | next [-]

Sorry, I was wrong here: I meant a single long session. And there's no compaction; the 1M context is only half used.

johndough 31 minutes ago | parent | prev [-]

> do you have to read the files in the same order in new sessions to ensure the correct prefix for the cache?

No, you do not need to re-read files. The entire conversation history is sent to the server with every prompt, so it already contains a copy of the files.
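The mechanism described above can be made concrete with a toy model of a prefix cache. This is a hedged sketch, not any provider's actual implementation: token counting is faked with whitespace splitting, and the cache is modeled as the longest shared token prefix between the previous request and the current one. Because each request resends the full history and only appends to it, nearly all tokens land in the cached prefix.

```python
# Toy model of a server-side prefix cache over chat requests.
# Each request carries the FULL conversation history; only the newly
# appended suffix misses the cache.

def tokens(messages):
    """Flatten a message list into a crude token stream (illustrative only)."""
    return [t for m in messages for t in f"{m['role']}:{m['content']}".split()]

def cache_hit_rate(previous, current):
    """Fraction of the current request's tokens shared with the cached prefix."""
    prev, cur = tokens(previous), tokens(current)
    shared = 0
    for a, b in zip(prev, cur):
        if a != b:
            break
        shared += 1
    return shared / len(cur) if cur else 0.0

history = [{"role": "user", "content": "read main.py and summarize it"}]
history.append({"role": "assistant", "content": "main.py defines the entry point"})
first_request = list(history)

# The next turn only appends, so the old history is an exact prefix.
history.append({"role": "user", "content": "now refactor the entry point"})
print(f"hit rate: {cache_hit_rate(first_request, history):.0%}")
```

As the session grows, the appended suffix becomes a smaller and smaller share of each request, which is how very high hit rates are possible in one long session.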