kelseyfrog | 2 days ago
What's the hardware cost of running it?
bbor | 2 days ago
I was curious, and some [intrepid soul](https://wavespeed.ai/blog/posts/deepseek-v4-gpu-vram-require...) did an analysis. Assuming you do everything perfectly and take full advantage of the model's MoE sparsity, it would take:

- To run at full precision: "16–24 H100s", giving us ~$400-600k upfront, or $8-12/h from [us-east-1](https://intuitionlabs.ai/articles/h100-rental-prices-cloud-c...).
- To run with "heavy quantization" (16 bits -> 8): "8xH100", giving us ~$200k upfront and ~$4/h.
- To run truly "locally"--i.e. in a house instead of a data center--you'd need four 4090s (one of the most powerful consumer GPUs available). Even that would clock in around $15k for the cards alone and ~$0.22/h for electricity (in the US).

Truly an insane industry. This is a good reminder of why datacenter capex since 2023 has eclipsed the Manhattan Project, the Apollo program, and the US interstate system combined...
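The arithmetic behind those figures can be sketched in a few lines. This is a rough back-of-envelope estimate, not the linked analysis's method: the 671B parameter count is the published DeepSeek-V3 size used as a stand-in, and the 450 W per 4090 and $0.12/kWh electricity rate are illustrative assumptions.

```python
# Back-of-envelope GPU cost arithmetic. All inputs are assumptions:
# ~671B params (DeepSeek-V3's published size), 450 W per RTX 4090,
# ~$0.12/kWh US average residential electricity.

def vram_needed_gb(params_billions: float, bytes_per_param: float) -> float:
    """Raw weight storage only; ignores KV cache and activation overhead."""
    return params_billions * bytes_per_param

def electricity_usd_per_hour(num_gpus: int, watts_per_gpu: float,
                             usd_per_kwh: float) -> float:
    """Hourly electricity cost for a set of GPUs running at full draw."""
    return num_gpus * watts_per_gpu / 1000 * usd_per_kwh

fp16 = vram_needed_gb(671, 2)  # FP16: 2 bytes per parameter
int8 = vram_needed_gb(671, 1)  # INT8 quantization halves that

# H100s have 80 GB each, so weights alone imply:
print(f"FP16: ~{fp16:.0f} GB -> {fp16 / 80:.1f}+ H100s")  # ~17, before overhead
print(f"INT8: ~{int8:.0f} GB -> {int8 / 80:.1f}+ H100s")  # ~8.4

# Four 4090s at ~450 W, ~$0.12/kWh:
print(f"Electricity: ~${electricity_usd_per_hour(4, 450, 0.12):.3f}/hour")
```

The FP16 figure lands at ~17 cards for weights alone, which is why the analysis quotes "16–24 H100s" once KV cache and activation overhead are included, and the electricity estimate comes out to roughly the ~$0.22/h cited above.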
redox99 | 2 days ago
Probably like 100 USD/hour
slashdave | 2 days ago
"if you have to ask..."