Denzel | a day ago
Uhm, you actually just proved their point if you run the numbers.

For simplicity's sake, assume DeepSeek 671B on 2x RTX 5090 running at 2 kW at full utilization. In 3 years you've paid $30k total: $20k for the system + $10k in electricity @ $0.20/kWh. The model generates 500M-1B tokens total over those 3 years @ 5-10 tokens/sec - and understand that's total throughput, reasoning and output tokens combined.

That works out to $30-$60/Mtok - more than both Opus 4.5 and GPT-5.2, for less performance and fewer features. And as the other commenters point out, this doesn't even factor in the extra DC costs of scaling it up for consumers, nor the cost of training the model.

Of course, you can play around with the parameters of the cost model, but this serves to illustrate that it's not so clear-cut whether the current AI service providers are profitable or not.
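For anyone who wants to poke at it, here's a quick sketch of that cost model in Python. Every number is just the assumption stated above ($20k box, 2 kW at $0.20/kWh, 5-10 tok/s over 3 years), nothing here is measured:

    # Back-of-the-envelope local-hosting cost model (illustrative only)
    system_cost = 20_000      # USD, the 2x RTX 5090 box assumed above
    power_kw = 2.0            # assumed draw at full utilization
    usd_per_kwh = 0.20
    years = 3
    hours = years * 365 * 24

    electricity = power_kw * hours * usd_per_kwh    # ~$10.5k
    total_cost = system_cost + electricity          # ~$30.5k

    for tps in (5, 10):
        total_tokens = tps * hours * 3600           # lifetime token output
        usd_per_mtok = total_cost / (total_tokens / 1e6)
        print(f"{tps} tok/s: {total_tokens/1e6:.0f}M tokens, ${usd_per_mtok:.0f}/Mtok")

That prints roughly $64/Mtok at 5 tok/s and $32/Mtok at 10 tok/s, i.e. the $30-$60 range quoted above.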
kingstnap | 20 hours ago
5 to 10 tokens per second is bungus-tier rates.

https://developer.nvidia.com/blog/nvidia-blackwell-delivers-...

NVIDIA's 8x B200 gets you 30k tps on DeepSeek 671B. At maximum utilization that's 1 trillion tokens per year; at a dollar per million tokens, that's $1 million. The hardware costs around $500k.

Now, ideal throughput is unlikely, so let's say you get half that. It's still 500B tokens per year.

Gemini 3 Flash is like $3/million tokens, and I assume it's a fair bit bigger, maybe 1 to 2T parameters. I can sort of see how you can get this to work with the margins the AI companies repeatedly assert.
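Same back-of-the-envelope in Python for the 8x B200 case. The 30k aggregate tok/s, ~$500k hardware, and $1/Mtok price are all figures taken from this comment, not anything I've benchmarked:

    # Server-scale version of the same arithmetic (figures from the comment above)
    hardware_cost = 500_000      # USD, assumed 8x B200 node
    aggregate_tps = 30_000       # claimed aggregate decode throughput
    usd_per_mtok = 1.0           # assumed selling price per million tokens

    seconds_per_year = 365 * 24 * 3600
    for utilization in (1.0, 0.5):
        tokens_per_year = aggregate_tps * utilization * seconds_per_year
        revenue = tokens_per_year / 1e6 * usd_per_mtok
        print(f"{utilization:.0%} util: {tokens_per_year/1e12:.2f}T tok/yr, "
              f"${revenue:,.0f}/yr revenue vs ${hardware_cost:,} hardware")

At full utilization that's about 0.95T tokens and ~$946k of revenue per year against $500k of hardware; at 50% utilization, about 0.47T tokens and ~$473k, which is where the margin question actually gets decided.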