I love the focus on cache hit efficiency. Hats off to the DeepSeek team for creating a great product that maximizes cost efficiency for the user.

▲

bwfan123 6 hours ago | parent | next [-]

> Hats off to the DeepSeek team for creating a great product

I have been using it for a while, and I wholeheartedly agree. IMO, it is as good as Codex or Claude, which I also use. It is a winner in the cost-sensitive tier, and if some startup could package it with data retention in mind, it could be a great product to sell to enterprises, since data retention and privacy are the main issues for the coding-assistant use case.

▲

chillfox 5 hours ago | parent [-]

DeepSeek v4 pro is definitely my preferred cheap model. It's very good, and I use it all the time for my personal projects (opencode go plan). I also use Claude Opus all the time at work, and DeepSeek is not as good as that, but it does compete with Sonnet on capability and beats it on price.

▲

pjerem 41 minutes ago | parent [-]

I have unlimited Claude Opus at work and it’s wonderful. I'm not allowed to use it for personal projects, though, so I use DeepSeek Pro on the $20 Ollama Cloud plan. It’s really not that far behind, and I have never hit the plan’s limits. It’s like 10-15% less powerful but costs 10 times less. Totally worth it. I prefer Opus because my employer pays for it, but I would personally never pay 10 times more for it.

▲

nicce 5 hours ago | parent | prev | next [-]

Just in case: note that this is someone's side project.

> Independent open-source project · not affiliated with DeepSeek

▲

Bombthecat 5 hours ago | parent | prev | next [-]

Add the already cheap API cost and you could probably let it run for days on the same task.

▲

stavros 6 hours ago | parent | prev [-]

How can you have cache hit efficiency? Isn't it just a matter of not changing the previous context? I don't understand what knobs there are to tweak on this.

▲

everforward 6 hours ago | parent [-]

> Isn't it just a matter of not changing the previous context?

Yes, but a lot of harnesses change previous context. E.g. the system prompt injects the current time/date, working directory, files in the working directory, etc. Compaction also changes the whole previous context. I _think_ changing the list of tools also invalidates cache, so invoking a subagent with different tools would invalidate the cache.
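To make the timestamp case concrete, here is a toy sketch. It assumes the provider caches on an exact token prefix, uses whitespace-split words as stand-in tokens, and `render_prompt` is a made-up helper, not any real harness's API. It only compares how much of the previous request survives as a prefix when the clock lives in the system prompt versus at the end of the request.

```python
from datetime import datetime, timedelta


def render_prompt(now: datetime, history: list[str], clock_in_system: bool) -> list[str]:
    """Flatten a request into pseudo-tokens (hypothetical layout, words as tokens)."""
    system = "You are a coding agent."
    stamp = f"Current time: {now.isoformat()}"
    if clock_in_system:
        parts = [system + " " + stamp, *history]  # volatile value near the start
    else:
        parts = [system, *history, stamp]  # volatile value at the very end
    return " ".join(parts).split()


def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Count leading tokens that are identical in both requests."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


t0 = datetime(2025, 1, 1, 12, 0, 0)
t1 = t0 + timedelta(minutes=1)
history = ["user: list the files", "assistant: main.py utils.py"]
next_history = history + ["user: open main.py"]

for clock_in_system in (True, False):
    first = render_prompt(t0, history, clock_in_system)
    second = render_prompt(t1, next_history, clock_in_system)
    print(clock_in_system, shared_prefix_len(first, second), len(first))
# prints "True 7 15" then "False 12 15"
```

With the clock in the system prompt, only the 7 tokens before the timestamp are reusable. With the clock at the end, all 12 tokens of system prompt and prior history still match.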

My vague impression is that it's in a similar vein to functional programming languages. It generally disallows doing things that lead to bugs (cache misses in this case), and presumably allows you to do those things in a way that makes it much clearer that this is likely to cause cache misses. I would guess that in this paradigm, you don't mutate your existing session, you derive a new session by mutating the prior context into a new context.
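Roughly what I have in mind, as a sketch only. This is my guess at the paradigm, not the project's actual API, and `Session`, `append`, and `fork_with_tools` are names I made up. The session is immutable, appending yields a new session that extends the old prefix, and anything that rewrites the prefix has to go through a method whose name says so.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Session:
    """Immutable context: hypothetical model, not a real harness type."""

    system: str
    tools: tuple[str, ...]
    messages: tuple[str, ...] = ()

    def append(self, message: str) -> Session:
        """Cache-friendly: the old context is an exact prefix of the new one."""
        return replace(self, messages=self.messages + (message,))

    def fork_with_tools(self, tools: tuple[str, ...]) -> Session:
        """Cache-breaking on purpose: the name signals that the prefix changes."""
        return replace(self, tools=tools)

    def is_prefix_of(self, other: Session) -> bool:
        """True if a request built from `self` is a prefix of one built from `other`."""
        return (
            self.system == other.system
            and self.tools == other.tools
            and other.messages[: len(self.messages)] == self.messages
        )


base = Session("You are a coding agent.", ("read", "write"))
turn1 = base.append("user: hi")
planner = turn1.fork_with_tools(("read",))

print(base.is_prefix_of(turn1))     # True
print(turn1.is_prefix_of(planner))  # False
```

Because the dataclass is frozen, there is no way to quietly edit `turn1` in place; the only way to change the tools is to derive `planner`, which makes the likely cache miss visible at the call site.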

▲

chillfox 5 hours ago | parent [-]

Changing between plan/build mode in some agents will change the tools list, which breaks the cache.

▲

brookst 5 hours ago | parent [-]

Cache is always there, it’s just that it only caches up to the point where an input token changes. So if the tools list is early in the prompt, changing it would limit cache for most of the prompt. If the tools list is the last thing, you could still get 99% cache hits even if it changes every turn.
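Back-of-the-envelope version, with segment sizes I made up for illustration (a 1,000-token tools list in a 100,000-token prompt). It assumes pure prefix caching and ignores the new tokens added each turn; `cached_fraction` is just a helper for this sketch.

```python
def cached_fraction(segments: list[tuple[str, int]], changed: str) -> float:
    """Fraction of prompt tokens that come before the first changed segment."""
    total = sum(size for _, size in segments)
    cached = 0
    for name, size in segments:
        if name == changed:
            break  # everything from here on must be recomputed
        cached += size
    return cached / total


# Hypothetical token counts per segment, in prompt order.
tools_first = [("tools", 1_000), ("system", 500), ("history", 98_500)]
tools_last = [("system", 500), ("history", 98_500), ("tools", 1_000)]

print(cached_fraction(tools_first, "tools"))  # 0.0
print(cached_fraction(tools_last, "tools"))   # 0.99
```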

▲

RevEng 4 hours ago | parent [-]

After a couple of turns, the system prompt is a small part of the context. Not changing the system prompt at all is key, so that the rest of the history is itself part of the prefix.