impure 4 hours ago

I recently made an AI agent and, surprisingly, coding with DeepSeek V4 Flash is quite cheap. That probably has to do with the aggressive prompt caching. I'm using OpenRouter with Novita AI as the preferred provider.

throwa356262 4 hours ago | parent | next [-]

DeepSeek V4 via DeepSeek themselves is significantly cheaper.

That's because of (1) the Huawei collab and (2) vLLM etc. not implementing half of the inference optimisations DeepSeek proposed in their paper.

kagamino 4 hours ago | parent | prev [-]

Same here, DeepSeek V4 Flash on opencode go. It's cheap, fast, and good enough to follow my instructions.

2muchtime 4 hours ago | parent [-]

I’m using zen because I have a Claude subscription and just like dabbling with the other models. I was shocked at how little Flash cost, but it was noticeably not at the level I’d like my model to be.

For me, MiniMax 3 has really hit the sweet spot of being very cheap, though more expensive than Flash, but also very capable.