esafak 15 hours ago
It's open source; the price is up to the provider, and I do not see any on OpenRouter yet. ̶G̶i̶v̶e̶n̶ ̶t̶h̶a̶t̶ ̶d̶e̶v̶s̶t̶r̶a̶l̶ ̶i̶s̶ ̶m̶u̶c̶h̶ ̶s̶m̶a̶l̶l̶e̶r̶,̶ ̶I̶ ̶c̶a̶n̶ ̶n̶o̶t̶ ̶i̶m̶a̶g̶i̶n̶e̶ ̶i̶t̶ ̶w̶i̶l̶l̶ ̶b̶e̶ ̶m̶o̶r̶e̶ ̶e̶x̶p̶e̶n̶s̶i̶v̶e̶,̶ ̶l̶e̶t̶ ̶a̶l̶o̶n̶e̶ ̶5̶x̶.̶ ̶I̶f̶ ̶a̶n̶y̶t̶h̶i̶n̶g̶ ̶D̶e̶e̶p̶S̶e̶e̶k̶ ̶w̶i̶l̶l̶ ̶b̶e̶ ̶5̶x̶ ̶t̶h̶e̶ ̶c̶o̶s̶t̶.̶ Edit: Mea culpa. I missed the difference between active and dense parameter counts.
NitpickLawyer 14 hours ago
> Given that devstral is much smaller, I can not imagine it will be more expensive

Devstral 2 is 123B dense. DeepSeek is 37B active. Inference on Devstral 2 will be slower and more expensive than on dsv3, especially since dsv3.2 has some goodies that make inference at long context more efficient than the previous generation.
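To make the dense vs. active comparison concrete, here is a rough back-of-the-envelope sketch. It assumes the common rule of thumb of about 2 FLOPs per active parameter per generated token, which is an approximation and not something stated in the thread. It ignores attention cost, memory bandwidth, and batching, and it does not account for the fact that a mixture-of-experts model still has to keep all of its weights in memory. The function and variable names are made up for illustration.

```python
# Rule-of-thumb sketch: forward-pass compute per generated token is roughly
# 2 FLOPs per *active* parameter. Attention cost, memory bandwidth, and
# batching effects are deliberately ignored.

FLOPS_PER_ACTIVE_PARAM = 2  # common approximation, not a measured value


def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs needed to produce one token."""
    return FLOPS_PER_ACTIVE_PARAM * active_params


# Parameter counts as quoted in the comment above.
devstral_2_active = 123e9  # dense: every parameter is used for every token
deepseek_v3_active = 37e9  # mixture-of-experts: only the routed experts run

ratio = flops_per_token(devstral_2_active) / flops_per_token(deepseek_v3_active)
print(f"Devstral 2 needs about {ratio:.1f}x the compute per token")
```

Under these assumptions the dense model does a bit over three times the arithmetic per token, which is the direction of the argument above. Actual provider pricing also depends on hardware, memory footprint, and utilization.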
aimanbenbaha 11 hours ago
DeepSeek V3.2 is that cheap because its attention mechanism is ridiculously efficient.