Sure, assuming the power cost reduction or capability increase justifies the expenditure. It's not clear that that will be the case. That's one of the shaky assumptions I'm referring to. It may be that the 2030 Nvidia accelerators will save you $2,000 in electricity per month per rack, and you can upgrade the whole rack for the low, low price of $800,000! That may not be worth it at all. If it saves you $200k per rack, or unlocks some additional capability that a 2025 accelerator is incapable of and customers are willing to pay for, then that's a different story. There are a ton of assumptions in these scenarios, and his logic doesn't seem to justify the confidence level.
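A quick Python sketch of the payback arithmetic behind those hypotheticals. All figures are the commenter's illustrations, not real prices, and the period for the $200k savings isn't stated, so it's assumed annual here purely for illustration; the function name is mine.

    # Hypothetical payback-period comparison (illustration only).
    def payback_years(upgrade_cost: float, annual_savings: float) -> float:
        """Years of savings needed to recoup the upgrade cost."""
        return upgrade_cost / annual_savings

    # $2,000/month in electricity savings vs. an $800,000 rack upgrade:
    print(payback_years(800_000, 2_000 * 12))   # ~33 years: clearly not worth it

    # $200k per rack (assumed per year) vs. the same upgrade:
    print(payback_years(800_000, 200_000))      # 4 years: a very different story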
| |
overfeed 8 hours ago:

> Sure, assuming the power cost reduction or capability increase justifies the expenditure. It's not clear that that will be the case.

Share price is a bigger consideration than any +/- difference[1] between expenditure and productivity delta. GAAP allows some flexibility in how servers are depreciated, so depending on what the company wants to signal to shareholders (investing in infra for future returns vs. curtailing costs), it may make sense to shorten or lengthen depreciation time regardless of the actual TCO of the keep/refresh comparison.

1. Hypothetical scenario: a hardware refresh costs $80B and the actual performance increase is only worth $8B, but the share price bump increases the value of the org's holdings of its own shares by $150B. As a CEO/CFO, which action would you recommend? And that's without even considering your own bonus, which is implicitly or explicitly tied to share price performance.
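A minimal sketch of that footnote's hypothetical, just to make the incentive gap explicit. The $80B/$8B/$150B figures are the commenter's invented numbers, and the variable names are mine.

    # Commenter's hypothetical: the refresh costs $80B, is operationally worth $8B,
    # but lifts the value of the org's holdings of its own shares by $150B.
    refresh_capex         = 80e9
    productivity_gain     = 8e9
    share_holdings_uplift = 150e9

    operational_net = productivity_gain - refresh_capex         # -$72B on fundamentals alone
    ceo_view_net    = operational_net + share_holdings_uplift   # +$78B once share price is counted

    print(f"fundamentals only: {operational_net / 1e9:+.0f}B")
    print(f"with share-price effect: {ceo_view_net / 1e9:+.0f}B")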
maxglute 14 hours ago:

Demand/supply economics is not so hypothetical. Illustration numbers: AI demand premium = $150 hardware with $50 electricity; normal demand = $50 hardware with $50 electricity. That's Nvidia margins at 75% instead of 40%, and CAPEX/OPEX at 70%/20% hardware/power instead of the customary 50%/40%.

If the bubble crashes, i.e. the AI demand premium evaporates, we're back at $50 hardware and $50 electricity, likely $50 hardware and $25 electricity if hardware improves. Nvidia goes back to 30-40% margins, and operators on old hardware are stuck with stranded assets.

The key thing to understand is that current racks are sold at grossly inflated premiums right now; it's scarcity pricing, a scarcity tax. If the current AI economic model doesn't work, that premium fundamentally goes away and subsequent build-outs become cost-plus/commodity pricing, i.e. capex discounted by nontrivial amounts. Any breakthrough in hardware, e.g. TPU compute efficiency, would stack opex (power) savings on top. Maybe by year 8 the first generation of data centers is still depreciating at $80 hardware + $50 power, vs. a new center at $50 hardware + $25 power. That old data center is a massive write-down, because it will generate less revenue than it costs to amortize.
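A small sketch of those illustration units (they are relative price points from the comment, not real dollar figures; the scenario labels are mine), comparing a unit of capacity under scarcity pricing today, a post-crash new build, and a year-8 first-generation center still carrying the old cost basis.

    # Commenter's illustrative per-unit cost points (hardware amortization + power).
    scenarios = {
        "AI-premium build (today)": {"hardware": 150, "power": 50},
        "post-crash new build":     {"hardware": 50,  "power": 25},
        "year-8 first-gen center":  {"hardware": 80,  "power": 50},  # still depreciating old capex
    }

    for name, c in scenarios.items():
        total = c["hardware"] + c["power"]
        print(f"{name}: hardware {c['hardware']} + power {c['power']} = {total}")

    # The year-8 center carries 130 in cost against new builds running at 75,
    # which is why the commenter expects a large write-down on the old assets.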
trollbridge 16 hours ago:

A typical data centre costs about $2,500 per year per kW of load (including overhead, HVAC and so on). If it costs $800,000 to replace the whole rack, that pays off in a year if it eliminates 320 kW of consumption. Back when we ran servers we wouldn't assume 100% utilisation, but AI workloads really do run that way; normal server loads would be around 10 kW per rack, and AI is closer to 100 kW. So yeah, it's not hard to imagine power savings equivalent to 3.2 racks being worth it.
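The arithmetic in that comment, spelled out. The $2,500/kW-year, $800,000, and 100 kW-per-rack figures are the commenter's ballpark numbers; the variable names are mine.

    cost_per_kw_year = 2_500     # all-in data centre cost per kW of load, per year
    rack_replacement = 800_000   # cost to replace the whole rack
    ai_rack_load_kw  = 100       # rough load of a fully utilised AI rack

    # kW of load you would need to shed for the replacement to pay off in one year
    breakeven_kw = rack_replacement / cost_per_kw_year    # 320 kW
    racks_equivalent = breakeven_kw / ai_rack_load_kw     # 3.2 racks' worth of power

    print(breakeven_kw, racks_equivalent)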
Octoth0rpe 15 hours ago:

Thanks for the numbers! Isn't it more likely that the amount of power/heat generated per rack will stay constant over each upgrade cycle, and the upgrade simply unlocks a higher amount of service revenue per rack?
PunchyHamster 11 hours ago:

Not in the last few years. CPUs went from ~200 W TDP to 500 W, and servers went from zero to multiple GPUs each. Though we might hit the "chips can't get bigger and cooling can't get much better" point eventually.

Usage would stay similar if it were, say, a rack of servers full of bulk storage (hard drives generally keep power usage about the same while growing capacity). But CPU/GPU-wise, it's just bigger chips, more chiplets, more power. I'd imagine any flattening would be purely because "we have the DC now, re-building the cooling for the next gen doesn't make sense, so we'll just build servers with similar power usage as before", but given how fast AI has pushed development, that might not happen for a while.
toast0 10 hours ago:

> Isn't it more likely that the amount of power/heat generated per rack will stay constant over each upgrade cycle,

Power density seems to grow each cycle. But eventually your DC hits power capacity limits, and you have to leave racks empty because there's no power budget.
|
|
|