Octoth0rpe 16 hours ago

> Krishna also referenced the depreciation of the AI chips inside data centers as another factor: "You've got to use it all in five years because at that point, you've got to throw it away and refill it," he said

This doesn't seem correct to me, or at least it's built on several shaky assumptions. You would only have to 'refill' your hardware if:

- AI accelerator cards all start dying around the 5 year mark, which is possible given the heat density/cooling needs, but doesn't seem all that likely.

- Technology advances such that only the absolute newest cards can be used to run _any_ model profitably, which only seems likely if we see some pretty radical advances in efficiency. Otherwise, assuming your hardware is stable after five years of burn-in, it seems like you could continue to run older models on it for only the cost of the floor space and power. Maybe you need new cards for new models for some reason (maybe a new fp format that only new cards support? some magic amount of RAM? etc.), but it seems like there may be room for revenue via older/less capable models at a discounted rate.

darth_avocado 9 hours ago | parent | next [-]

Isn’t that what Michael Burry is complaining about? That five years is actually too generous when it comes to depreciation of these assets and that companies are being too relaxed with that estimate. The real depreciation is more like 2-3 years for these GPUs that cost tens of thousands of dollars apiece.

https://x.com/michaeljburry/status/1987918650104283372

enopod_ 2 hours ago | parent | next [-]

That's exactly the thing. It's only about bookkeeping.

The big AI corps keep pushing depreciation for GPUs further into the future, no matter how long the hardware is actually useful. Some of them are now at 6 years. But GPUs are advancing fast, and new hardware brings more flops per watt, so there's a strong incentive to switch to the latest chips. Also, they run 24/7 at 100% capacity, so after only 1.5 years a fair share of the chips is already toast. How much hardware do they have on their books that's actually not useful anymore? No one knows!

Slower depreciation means more profit right now (for those companies that actually make a profit, like MS or Meta), but it's just kicking the can down the road. Eventually all these investments have to come off the books, and that's where it will eat their profits. In 2024 the big AI corps invested about $1 trillion in AI hardware, and next year is expected to be $2 trillion. The interest payments alone are crazy. And all of this comes on top of the fact that none of these companies actually makes any profit at all with AI (except Nvidia, of course). There's just no way this will pan out.
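To make the bookkeeping point concrete, here's a rough sketch in Python with entirely made-up numbers (the capex, revenue and cost figures are placeholders, not any company's actual financials) showing how stretching the depreciation schedule changes reported profit today:

    # Straight-line depreciation of a hypothetical GPU fleet purchase,
    # under a 3-year vs a 6-year schedule. All numbers are made up.
    capex = 100e9            # hypothetical $100B of GPU hardware
    revenue_per_year = 30e9  # hypothetical revenue attributable to that fleet
    other_costs = 10e9       # hypothetical power, staff, networking, etc.

    for years in (3, 6):
        annual_depreciation = capex / years
        reported_profit = revenue_per_year - other_costs - annual_depreciation
        print(f"{years}-year schedule: depreciation ${annual_depreciation / 1e9:.1f}B/yr, "
              f"reported profit ${reported_profit / 1e9:.1f}B/yr")

Same cash out the door either way; the longer schedule just pushes the expense into later years, which is the can-kicking being described.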

duped 7 hours ago | parent | prev [-]

How different is this from rental car companies changing over their fleets? I don't know, this is a genuine question. The cars cost 3-4x as much and last about 2x as long, as far as I know, and the secondary market is still alive.

logifail 4 hours ago | parent | next [-]

> How different is this from rental car companies changing over their fleets?

New generations of GPUs leapfrog in efficiency (performance per watt) and vehicles don't? Cars don't get exponentially better every 2–3 years, meaning the second-hand market is alive and well. Some of us are quite happy driving older cars (two parked outside our home right now, both well over 100,000km driven).

If you have a datacentre with older hardware, and your competitor has the latest hardware, you face the same physical space constraints, same cooling and power bills as they do? Except they are "doing more" than you are...

Perhaps we could call it "revenue per watt"?

wongarsu an hour ago | parent | next [-]

The traditional framing would be cost per flop. At some point your total cost per flop over the next 5 years will be lower if you throw out the old hardware and replace it with newer, more efficient models. With traditional servers that's typically after 3-5 years; with GPUs, 2-3 years sounds about right.
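A back-of-the-envelope version of that break-even, with invented numbers (the capex, power draw, throughput and facility cost below are assumptions for illustration, not real pricing):

    # Hypothetical numbers only: total cost per delivered PFLOP-hour over the
    # next 5 years, keeping an old GPU node vs replacing it with a newer one.
    HOURS = 5 * 8760                 # five years of 24/7 operation
    FACILITY_PER_KW_YEAR = 2500      # assumed all-in $/kW-year (power, cooling, space)

    def cost_per_pflop_hour(capex, kw, pflops):
        facility = kw * FACILITY_PER_KW_YEAR * 5
        return (capex + facility) / (pflops * HOURS)

    old = cost_per_pflop_hour(capex=0,       kw=10, pflops=1)  # sunk cost, kept running
    new = cost_per_pflop_hour(capex=200_000, kw=10, pflops=5)  # must be purchased
    print(f"old: ${old:.2f}/PFLOP-h  new: ${new:.2f}/PFLOP-h")

With these made-up figures the replacement wins; shrink the efficiency gap or the facility cost and the answer flips, which is exactly the 3-5 year / 2-3 year judgment call.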

The major reason companies now keep their old GPUs around much longer is the supply constraints.

bbarnett 2 hours ago | parent | prev [-]

The used market is going to be absolutely flooded with millions of old cards. I imagine shipping will be the most expensive cost for them. The supply side will be insane.

Think 100 cards but only 1 buyer as a ratio. Profit for ebay sellers will be on "handling", or inflated shipping costs.

eg shipping and handling.

3form an hour ago | parent [-]

I assume NVIDIA and co. already protect themselves in some way, either by these cards not being very useful after resale, or by requiring them to go to the grinder after they expire.

bbarnett 31 minutes ago | parent [-]

Cards don't "expire". There are alternate strategies to selling cards, but if they don't sell the cards, then there is no transfer of ownership, and therefore NVIDIA is entering some form of leasing model.

If NVIDIA is leasing, then you can't use those cards as collateral. You also can't write off depreciation. Part of what we're discussing is that terms of credit are being extended too generously, with depreciation in the mix.

They could require some form of contractual arrangement, perhaps volume discounts for cards if buyers agree to destroy them at a fixed time. That's very weird though, and I've never heard of such a thing for datacenter gear.

They may protect themselves on the driver side, but someone could still write OSS.

afavour 6 hours ago | parent | prev | next [-]

Rental car companies aren’t offering rentals at deep discount to try to kickstart a market.

It would be much less of a problem if these companies were profitable and could cover the costs of renewing hardware, like car rental companies can.

cjonas 6 hours ago | parent | prev | next [-]

I think it's a bit different because a rental car generates direct revenue that covers its cost. These GPU data centers are being used to train models (which themselves quickly become obsolete) and provide inference at a loss. Nothing in the current chain is profitable except selling the GPUs.

sho 5 hours ago | parent [-]

> and provide inference at a loss

You say this like it's some sort of established fact. My understanding is the exact opposite and that inference is plenty profitable - the reason the companies are perpetually in the red is that they're always heavily investing in the next, larger generation.

I'm not Anthropic's CFO so I can't really prove who's right one way or the other, but I will note that your version relies on everyone involved being really, really stupid.

elktown 3 hours ago | parent | next [-]

“like it's some sort of established fact” -> “My understanding”?! a.k.a pure speculation. Some of you AI fans really need to read your posts out loud before posting them.

teodosin 3 hours ago | parent [-]

You misread the literal first snippet you quoted. There's no contradiction in what you replied to.

elktown 2 hours ago | parent [-]

No?

darkwater 4 hours ago | parent | prev | next [-]

The current generation of today was the next generation of yesterday. So, unless the services sold on inference can cover the cost of inference + training AND make money, they are still operating at a loss.

rvba 2 hours ago | parent | prev [-]

Or just "everyone" being greedy

chii 6 hours ago | parent | prev [-]

> the secondary market is still alive.

This is the crux. Will these data center cards have a secondary market to sell into when a newer, more efficient model comes out?

It could be that second-hand AI hardware going into consumers' hands is how they offload it without huge losses.

vesrah 6 hours ago | parent | next [-]

The GPUs going into data centers aren't the kind that can just be reused by putting them into a consumer PC and playing some video games; most don't even have video output ports, and they put out FPS similar to cheap integrated GPUs.

geerlingguy 5 hours ago | parent [-]

And the big ones don't even have typical PCIe sockets, they are useless outside of behemoth rackmount servers requiring massive power and cooling capacity that even well-equipped homelabs would have trouble providing!

physicsguy 5 hours ago | parent | prev | next [-]

Data centre cards don't have fans and don't have video out these days.

chii 5 hours ago | parent [-]

I don't mean the consumer market for video cards - I mean a consumer buying AI chips to run themselves so they can have it locally.

If I can buy a $10k AI card for less than $5,000, I probably would, if I can use it to run an open model myself.

mike_hearn an hour ago | parent | next [-]

Why would you do that when you can pay someone else to run the model for you on newer more efficient and more profitable hardware? What makes it profitable for you and not for them?

mkjs 4 hours ago | parent | prev | next [-]

At that point it isn't a $10k card anymore, it's a $5k card. And possibly not a $5k card for very long in the scenario that the market has been flooded with them.

darkwater 4 hours ago | parent | prev | next [-]

How many "yous" are there in the world? Probably a number that can buy what's inside one Azure DC?

physicsguy 5 hours ago | parent | prev | next [-]

Ah, well, yes, to a degree that's possible, but at least at the moment you'd still be better off buying a $5k Mac Studio if it's just inference you're doing.

esseph 2 hours ago | parent | prev [-]

You need the hardware to wrap that in, and the power draw is going to be... significant.

slashdave 11 hours ago | parent | prev | next [-]

5 years is long, actually. This is not a GPU thing. It's standard for server hardware.

bigwheels 10 hours ago | parent | next [-]

Because usually it's more efficient for companies to retire the hardware and put in new stuff.

Meanwhile, my 10-15 year old server hardware keeps chugging along just fine in the rack in my garage.

rsynnott 2 hours ago | parent | next [-]

"Just fine". Presumably you're not super-concerned with the energy costs? People who run data centres pretty much _have_ to be.

slashdave 9 hours ago | parent | prev | next [-]

More than that. The equipment is depreciated on a 5 year schedule on the company balance sheet. It actually costs nothing to discard it.

johncolanduoni 9 hours ago | parent [-]

There’s no marginal tax impact of discarding it or not after 5 years - if it was still net useful to keep it powered, they would keep it. Depreciation doesn’t demand you dispose of or sell the item to see the tax benefit.

mattmaroon 8 hours ago | parent [-]

No, but it tips the scales. If the new hardware is a little more efficient, but perhaps not so much so that you would necessarily replace it, the ability to depreciate the new stuff but not the old stuff might tip your decision.

AdrianB1 9 hours ago | parent | prev | next [-]

I thought the same until I calculated that newer hardware consumes a few times less energy and for something running 24x7 that adds up quite a bit (I live in Europe, energy is quite expensive).

So my homelab equipment is just 5 years old and it will get replaced in 2-3 years with something even more power efficient.

tharkun__ 6 hours ago | parent | next [-]

Where in Europe?

Asking coz I just did a quick comparison and it seems to depend but for comparison I have a really old AMD Athlon "e" processor (like literally September 2009 is when it came out according to some quick Google search, tho I probably bought it a few months later than that but still ...) that runs at ~45W TDP. In idle conditions, it typically consumes around 10 to 15 watts (internet wisdom, not kill-a-watt-wisdom).

Some napkin math says it would cost me about 40 years worth of amortization to replace this at my current power rates for this system. So why would I replace it? And even with some EU countries' power rates we seem to be at 5-10 years amortization upon replacement. I've been running this motherboard, CPU + RAM combo for ~15 years now it seems, replacing only the hard drives every ~3 years. And the tower it's in is about 25 years old.

Oh I forgot, I think I had to buy two new CR2032 batteries during those years (CMOS battery).

Now granted, this processor can basically do "nothing" in comparison to a current system I might buy. But I also don't need more for what it does.
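For anyone who wants to redo that napkin math with their own rates, the shape of it looks something like this (the wattages, power price and replacement cost below are placeholder assumptions, not the parent's exact figures):

    # Placeholder numbers: payback time for replacing an old, mostly idle box
    # with a more efficient one, based on idle power draw alone.
    old_idle_w = 12          # assumed idle draw of the old system
    new_idle_w = 5           # assumed idle draw of a modern replacement
    price_per_kwh = 0.30     # assumed European-ish rate in EUR/kWh
    replacement_cost = 400   # assumed cost of the new hardware in EUR

    saved_kwh_per_year = (old_idle_w - new_idle_w) * 8760 / 1000
    saved_eur_per_year = saved_kwh_per_year * price_per_kwh
    print(f"saves ~{saved_eur_per_year:.0f} EUR/yr, "
          f"pays back in ~{replacement_cost / saved_eur_per_year:.0f} years")

At mostly-idle loads, even expensive European power takes decades to pay back a replacement; a box under sustained 24/7 load is a very different calculation.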

z0mghii 6 hours ago | parent [-]

Well if you have a system that does "nothing" it's hard to argue to replace it

bbarnett 2 hours ago | parent [-]

"Nothing" from parent was a comparison. Doesn't mean their system is idle.

However many systems are mostly idle. A file server often doesn't use much cpu. It often isn't even serving anything.

prmoustache 4 hours ago | parent | prev | next [-]

I guess you did the math but wouldn't it be more effective to spend the money on solar panels instead of replacing the computer hardware?

thehappypm 6 hours ago | parent | prev [-]

Energy is very cheap for data centers. have you ever looked up wholesale energy rates? It’s like a cent per kilowatt hour.

XorNot 10 hours ago | parent | prev [-]

Sample size of 1 though. It's like how I've had hard disks last a decade, but a 100 node Hadoop cluster had 3 die per week after a few years.

snuxoll 10 hours ago | parent [-]

Spinning rust and fans are the outliers when it comes to longevity in compute hardware. I’ve had to replace a disk or two in my rack at home, but at the end of the day the CPUs, RAM, NICs, etc. all continue to tick along just fine.

When it comes to enterprise deployments, the lifecycle always revolves around price/performance. Why pay for old gear that sucks up power and runs 30% slower than the new hotness, after all!

But, here we are, hitting limits of transistor density. There’s a reason I still can’t get 13th or 14th gen poweredge boxes for the price I paid for my 12th gen ones years ago.

matt-p 10 hours ago | parent | prev [-]

5 years is a long time for GPUs maybe but normal servers have 7 year lifespans in many cases fwiw.

These GPUs, I assume, have potential longevity issues due to the density; if you could cool them really, really well, I imagine there'd be no problem.

atherton94027 9 hours ago | parent | next [-]

> normal servers have 7 year lifespans in many cases fwiw

Eight years if you use Hetzner servers!

slashdave 9 hours ago | parent | prev [-]

Normal servers are rarely run flat-out. These GPUs are supposed to be run that way. So, yeah, age is going to be a problem, as will cooling.

abraae 16 hours ago | parent | prev | next [-]

It's just the same dynamic as old servers. They still work fine but power costs make them uneconomical compared to latest tech.

acdha 16 hours ago | parent | next [-]

It’s far more extreme: old servers are still okay on I/O, and memory latency, etc. won’t change that dramatically, so you can still find productive uses for them. AI workloads are hyper-focused on a single type of work and, unlike most regular servers, are a limiting factor in direct competition with other companies.

matt-p 10 hours ago | parent [-]

I mean, you could use training GPUs for inference, right? That would be use case number one for an 8x A100 box in a couple of years. It can also be used for non-IO-limited things like folding proteins or other 'scientific' use cases. If push comes to shove, I'm sure an old A100 will run Crysis.

physicsguy 5 hours ago | parent | next [-]

> Push comes to shove im sure an old A100 will run crysis.

They don’t have video out ports!

fulafel 4 hours ago | parent | prev [-]

Just like laptop dGPUs.

oblio 7 hours ago | parent | prev [-]

All those use cases would probably use up 1% of the current AI infrastructure, let alone what they're planning to build.

Yeah, just like gas, possible uses will expand if AI crashes out, but:

* will these uses cover, say, 60% of all this infra?

* will these uses scale up to use that 60% within the next 5-7 years, while that hardware is still relevant and fully functional?

Also, we still have railroad tracks from the 1800s rail mania that were never truly used to capacity and dot com boom dark fiber that's also never been used fully, even with the internet growing 100x since. And tracks and fiber don't degrade as quickly as server hardware and especially GPUs.

m00x 10 hours ago | parent | prev | next [-]

LambdaLabs is still making money off their Tesla V100s, A100s, and A6000s. The older ones are capable enough to run some models and are very cheap, so if that's all you need, that's what you'll pick.

The V100 was released in 2017, and the A100 and A6000 in 2020.

Havoc 14 hours ago | parent | prev | next [-]

That could change with a power generation breakthrough. If power is very cheap then running ancient gear till it falls apart starts making more sense

rgmerk 4 hours ago | parent | next [-]

Hugely unlikely.

Even if the power is free you still need a grid connection to move it to where you need it, and, guess what, the US grid is bursting at the seams. This is not just due to data center demand; it was struggling to cope with the transition away from coal well before that point.

You also can’t buy a gas turbine for love nor money at the moment, and they’re not ever going to be free.

If you plonked massive amounts of solar panels and batteries in the Nevada desert, that’s becoming cheap but it ain’t free, particularly as you’ll still need gas backup for a string of cloudy days.

If you think SMRs are going to be cheap, I have a bridge to sell you; you're also not going to build them right next to your data centre, because the NRC won't let you.

So that leaves fusion or geothermal. Geothermal is not presently “very cheap” and fusion power has not been demonstrated to work at any price.

overfeed 9 hours ago | parent | prev [-]

Power consumption is only part of the equation. More efficient chips => less heat => lower cooling costs and/or higher compute density in the same space.

nish__ 8 hours ago | parent [-]

Solution: run them in the north. Put a server in the basement of every home in Edmonton and use the excess heat to warm the house.

zppln 16 hours ago | parent | prev | next [-]

I'm a little bit curious about this. Where does all the hardware from the big tech giants usually go once they've upgraded?

q3k 11 hours ago | parent | next [-]

In-house hyperscaler stuff gets shredded, after every single piece of flash storage gets first drilled through and every hard drive gets bent by a hydraulic press. Then it goes into the usual e-waste recycling stream (ie. gets sent to poor countries where precious metals get extracted by people with a halved life expectancy).

Off-the-shelf enterprise gear has a chance to get a second life through remarketing channels, but much of it also gets shredded due to dumb corporate policies. There are stories of some companies refusing to offload a massive decom onto the second hand market as it would actually cause a crash. :)

It's a very efficient system, you see.

oblio 7 hours ago | parent [-]

Similar to corporate laptops, where due to stupid policies at most BigCos you can't really buy or otherwise get a used laptop, even as the former corporate user of said laptop.

Super environmentally friendly.

trollbridge 16 hours ago | parent | prev | next [-]

I use (relatively) ancient servers (5-10 years in age) because their performance is completely adequate; they just use slightly more power. As a plus, it's easy to buy spare parts, and they run on DDR3, so I'm not paying the current "RAM tax". I generally get such a server, max out its RAM, max out its CPUs and put it to work.

taneq 11 hours ago | parent [-]

Same, the bang for buck on a 5yo server is insane. I got an old Dell a year ago (to replace our 15yo one that finally died) and it was $1200 AUD for a maxed out recently-retired server with 72TB of hard drives and something like 292GB of RAM.

PunchyHamster 11 hours ago | parent [-]

Just not too old. Easy to get into "power usage makes it not worth it" for any use case when it runs 24/7

monster_truck 10 hours ago | parent | next [-]

Seriously. 24/7 adds up faster than most realize!

The idle wattage per module has shrunk from 2.5-3W down to 1-1.2W between DDR3 & DDR5. Assuming a 1.3W difference (so 10.4W for 8,760 hours), a DDR3 machine with 8 sticks would increase your yearly power consumption by almost 1% (assuming an average 10,500 kWh/yr household).

That's only a couple dollars in most cases but the gap is only larger in every other instance. When I upgraded from Zen 2 to Zen 3 it was able to complete the same workload just as fast with half as many cores while pulling over 100W less. Sustained 100% utilization barely even heats a room effectively anymore!
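Plugging the figures above into a quick sketch (same assumptions as the parent: a 1.3W per-stick idle delta, 8 sticks, 24/7 operation, a 10,500 kWh/yr household):

    # Reproducing the parent's estimate: idle power difference between
    # 8 sticks of DDR3 and DDR5, run 24/7 for a year.
    watts_saved_per_stick = 1.3      # parent's assumed DDR3-vs-DDR5 idle delta
    sticks = 8
    household_kwh_per_year = 10_500  # parent's assumed average household usage

    extra_kwh = watts_saved_per_stick * sticks * 8760 / 1000
    print(f"{extra_kwh:.0f} kWh/yr extra, "
          f"{100 * extra_kwh / household_kwh_per_year:.1f}% of household usage")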

blackenedgem 2 hours ago | parent | next [-]

The one thing to be careful of with Zen 2 onwards is that if your server is going to be idling most of the time, then the majority of your power usage comes from the IO die. Quite a few times you'd be better off with the "less efficient" Intel chips, because they save 10-20 watts when doing nothing.

nish__ 8 hours ago | parent | prev [-]

Wake on LAN?

darkwater 3 hours ago | parent [-]

Then you can't enjoy some very useful and commonly used home server functions like home automation or an NVR.

dpe82 10 hours ago | parent | prev | next [-]

Maybe? The price difference on newer hardware can buy a lot of electricity, and if you aren't running stuff at 100% all the time the calculation changes again. Idle power draw on a brand new server isn't significantly different from one that's 5 years old.

taneq 3 hours ago | parent | prev [-]

To be clear, this server is very lightly loaded, it's just running our internal network services (file server, VPN/DNS, various web apps, SVN etc.) so it's not like we're flogging a room full of GeForce 1080Ti cards instead of buying a new 4090Ti or whatever. Also it's at work so it doesn't impact the home power bill. :D

wmf 16 hours ago | parent | prev [-]

Some is sold on the used market; some is destroyed. There are plenty of used V100 and A100 available now for example.

dogman144 15 hours ago | parent | prev | next [-]

Manipulating this for creative accounting seems to be the root of Michael Burry’s argument, although I’m not fluent enough in his figures to map it here. But it's interesting to see IBM argue a similar case (somewhat), and comments ITT hitting the same known facts, in light of Nvidia’s counterpoints to him.

PunchyHamster 11 hours ago | parent | prev | next [-]

Eh, not exactly. If you don't run the CPU at 70%+, the rest of the machine isn't that much more inefficient than a model generation or two behind.

It used to be that a new server could use half the power of the old one at idle, but vendors figured out a while ago that servers also need proper power management, and it's much better now.

The last few gens' increases could be summed up as "a low-percentage increase in efficiency, with TDP, memory channel and core count increases".

So for loads that aren't CPU-bound, the savings on a newer gen aren't nearly worth the replacement, and for bulk storage the CPU power usage is an even smaller part.

matt-p 10 hours ago | parent [-]

Definitely, single-thread performance and storage are the main reasons not to use an old server. A 6-year-old server didn't have NVMe drives, so SATA SSD at best. That's a major slowdown if disk is important.

Aside from that, there's no reason not to use a dual-socket server from 5 years ago instead of a single-socket server of today. Power and reliability maybe not as good.

zozbot234 5 hours ago | parent [-]

NVMe is just a different form factor for what's essentially a PCIe connection, and adapters are widely available to bridge these formats. Surely old servers will still support PCIe?

knowitnone3 10 hours ago | parent | prev [-]

That was then. Now, high-end chips are reaching 4, 3, 2 nm. Power savings aren't that high anymore. What's the power saving going from 4nm to 2nm?

monster_truck 10 hours ago | parent | next [-]

+5-20% clock speed at 5-25% lower voltages (which has been and continues to be the trend) adds up quickly from gen to gen, never mind density or IPC gains.

baq 3 hours ago | parent [-]

We can’t really go lower on voltage anymore without a very significant change in the materials used. Silicon band gap yadda yadda.

rzerowan 15 hours ago | parent | prev | next [-]

I think it's illustrative to consider the previous computation cycle, i.e. cryptomining, which passed through a similar lifecycle with energy and GPU accelerators.

The need for cheap wattage forced operations to arbitrage location for the cheapest/most reliable existing supply - there was rarely new buildout, as the cost had to be reimbursed by the coins the mining pool recovered.

For the chips, the situation caused the same appreciation in GPU card prices, with periodic offloading of cards to the secondary market (after wear and tear) as newer/faster/more efficient cards came out, until custom ASICs took over the heavy lifting, causing the GPU card market to pivot.

Similarly, in the short to medium term, the uptick of custom ASICs like Google's TPU will definitely make a dent in both capex/opex and potentially also lead to a market of used GPUs as ASICs dominate.

So for GPUs I can certainly see the 5-year horizon making an impact on investment decisions as ASICs proliferate.

mcculley 16 hours ago | parent | prev | next [-]

But if your competitor is running newer chips that consume less power per operation, aren't you forced to upgrade as well and dispose of the old hardware?

Octoth0rpe 16 hours ago | parent | next [-]

Sure, assuming the power cost reduction or capability increase justifies the expenditure. It's not clear that that will be the case. That's one of the shaky assumptions I'm referring to. It may be that the 2030 nvidia accelerators will save you $2000 in electricity per month per rack, and you can upgrade the whole rack for the low, low price of $800,000! That may not be worth it at all. If it saves you $200k/per rack or unlocks some additional capability that a 2025 accelerator is incapable of and customers are willing to pay for, then that's a different story. There are a ton of assumptions in these scenarios, and his logic doesn't seem to justify the confidence level.
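To put rough numbers on that decision, here's a sketch using the hypothetical figures above (the five-year useful life is an added assumption of mine):

    # Hypothetical figures from the comment above: does a rack upgrade pay off?
    upgrade_cost = 800_000           # assumed cost to refresh one rack
    power_savings_per_month = 2_000  # assumed electricity savings per rack
    useful_life_years = 5            # assumed life of the new hardware

    lifetime_savings = power_savings_per_month * 12 * useful_life_years
    print(f"lifetime savings ${lifetime_savings:,} vs cost ${upgrade_cost:,} -> "
          f"{'worth it' if lifetime_savings > upgrade_cost else 'not worth it'}")

At $2k/month the refresh never pays for itself on power alone; at the $200k-per-rack savings level, or with new revenue the old cards can't earn, it's a different story - which is exactly the range of assumptions being argued over.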

overfeed 8 hours ago | parent | next [-]

> Sure, assuming the power cost reduction or capability increase justifies the expenditure. It's not clear that that will be the case.

Share price is a bigger consideration than any +/- difference[1] between expenditure and productivity delta. GAAP allows some flexibility in how servers are depreciated, so depending on what the company wants to signal to shareholders (investing in infra for future returns vs curtailing costs), it may make sense to shorten or lengthen depreciation time regardless of the actual keep-vs-refresh TCO comparison.

1. Hypothetical scenario: a hardware refresh costs $80B and the actual performance increase is only worth $8B, but the share price bump increases the value of the org's holding of its own shares by $150B. As a CEO/CFO, which action would you recommend, without even considering your own bonus that's implicitly or explicitly tied to share price performance?

maxglute 14 hours ago | parent | prev | next [-]

Demand/supply economics is not so hypothetical.

Illustration numbers: AI demand premium = $150 hardware with $50 electricity. Normal demand = $50 hardware with $50 electricity. This is Nvidia margins @75% instead of 40%. CAPEX/OPEX is 70%/20% hardware/power instead of customary 50%/40%.

If the bubble crashes, i.e. the AI demand premium evaporates, we're back at $50 hardware and $50 electricity. Likely $50 hardware and $25 electricity if hardware improves. Nvidia goes back to 30-40% margins, and operators on old hardware are stuck with stranded assets.

The key thing to understand is that current racks are sold at grossly inflated premiums right now - scarcity pricing/tax. If the current AI economic model doesn't work, then fundamentally that premium goes away and subsequent build-outs are going to be cost-plus/commodity pricing = capex discounted by non-trivial amounts. Any breakthroughs in hardware, i.e. TPU compute efficiency, would stack opex (power) savings. Maybe by year 8, the first gen of data centers is still depreciated at $80 hardware + $50 power vs a new center @ $50 hardware + $25 power. That old data center is a massive write-down because it will generate less revenue than it costs to amortize.

trollbridge 16 hours ago | parent | prev [-]

A typical data centre is $2,500 per year per kW load (including overhead, hvac and so on).

If it costs $800,000 to replace the whole rack, then that would pay off in a year if it reduces 320 kW of consumption. Back when we ran servers we wouldn't assume 100% utilisation, but AI workloads do; normal server loads would be 10 kW per rack and AI is closer to 100. So yeah, it's not hard to imagine power savings of 3.2 racks' worth being worth it.
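Worked through with the parent's figures (the ~100 kW per AI rack is also taken from the comment above):

    # Using the parent's figures: $2,500 per kW-year all-in datacenter cost,
    # an $800,000 rack replacement, and ~100 kW per AI rack.
    cost_per_kw_year = 2_500
    replacement_cost = 800_000
    kw_per_ai_rack = 100

    kw_saved_for_one_year_payback = replacement_cost / cost_per_kw_year
    print(f"need to free up {kw_saved_for_one_year_payback:.0f} kW "
          f"(~{kw_saved_for_one_year_payback / kw_per_ai_rack:.1f} AI racks' worth) "
          f"to pay back in one year")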

Octoth0rpe 15 hours ago | parent [-]

Thanks for the numbers! Isn't it more likely that the amount of power/heat generated per rack will stay constant over each upgrade cycle, and the upgrade simply unlocks a higher amount of service revenue per rack?

PunchyHamster 11 hours ago | parent | next [-]

Not in the last few years. CPUs went from ~200W TDP to 500W.

And they went from zero to multiple GPUs per server. Tho we might hit "the chips can't be bigger and the cooling can't get much better" point there.

The usage would be similar if it was say a rack filled with servers full of bulk storage (hard drives generally keep the power usage similar while growing storage).

But CPU/GPU wise, it's just bigger chips/more chiplets, more power.

I'd imagine any flattening might be purely because "we have DC now, re-building cooling for next gen doesn't make sense so we will just build servers with similar power usage as previously", but given how fast AI pushed the development it might not happen for a while.

toast0 10 hours ago | parent | prev [-]

> Isn't it more likely that the amount of power/heat generated per rack will stay constant over each upgrade cycle,

Power density seems to grow each cycle. But eventually your DC hits power capacity limits, and you have to leave racks empty because there's no power budget.

HWR_14 15 hours ago | parent | prev [-]

It depends on how much profit you are making. As long as you can still be profitable on the old hardware you don't have to upgrade.

AstroBen 10 hours ago | parent [-]

That's the thing though: a competitor with better power efficiency can undercut you and take your customers

tzs 8 hours ago | parent [-]

Or they could charge the same as you and make more money per customer. If they already have as many customers as they can handle doing that may be better than buying hardware to support a larger number of customers.

austin-cheney 16 hours ago | parent | prev | next [-]

It’s not about assumptions on the hardware. It’s about the current demands for computation and expected growth of business needs. Since we have a couple years to measure against it should be extremely straightforward to predict. As such I have no reason to doubt the stated projections.

9cb14c1ec0 11 hours ago | parent | next [-]

> Since we have a couple years to measure against

Trillion pound baby fallacy.

lumost 10 hours ago | parent | prev | next [-]

Networking gear was famously overbought. Enterprise hardware is tricky as there isn’t much of a resale market for this gear once all is said and done.

The only valid use case for all of this compute which could reasonably replace AI is BTC mining. I’m uncertain if the increased mining capacity would harm the market or not.

piva00 2 hours ago | parent [-]

BTC mining on GPUs hasn't been profitable for a long time; it's mostly ASICs now. GPUs can be used for some other altcoins, which makes the potential market for used previous-generation GPUs even smaller.

blackenedgem 2 hours ago | parent [-]

That assumes you can add compute in a vacuum. If your altcoin receives 10x compute then it becomes 10x more expensive to mine.

That only scales if the coin goes up in value due to the extra "interest". Which isn't impossible, but there's a limit, and it's more likely to happen with smaller coins.

andix 15 hours ago | parent | prev [-]

Failure rates also go up. For AI inference it’s probably not too bad in most cases, just take the node offline and re-schedule the jobs to other nodes.

rlupi 14 hours ago | parent | prev | next [-]

Do not forget that we're talking about supercomputers. Their interconnect makes machines not easily fungible, so even a low reduction in availability can have dramatic effects.

Also, after the end of the product life, replacement parts may no longer be available.

You need to get pretty creative with repair & refurbishment processes to counter these risks.

loeg 10 hours ago | parent | prev | next [-]

It's option #2. But 5-year depreciation is optimistic; 2-3 years is more realistic.

marcosdumay 9 hours ago | parent | prev | next [-]

Historically, GPUs have improved in efficiency fast enough that people retired their hardware in way less than 5 years.

Also, historically the top of the line fabs were focused on CPUs, not GPUs. That has not been true for a generation, so it's not really clear if the depreciation speed will be maintained.

chii 6 hours ago | parent [-]

> that people retired their hardware in way less than 5 years.

those people are end-consumers (like gamers), and only recently, bitcoin miners.

Gamers don't care for "profit and loss" - they want performance. Bitcoin miners do need to switch if they want to keep up.

But will an AI data center do the same?

thinkmassive 5 hours ago | parent | next [-]

Mining bitcoin with a GPU hasn't been profitable in over a decade.

TingPing 6 hours ago | parent | prev [-]

The rate of change is equal for all groups. The gaming market can be the most conservative since it’s just luxury.

dmoy 16 hours ago | parent | prev | next [-]

5 years is maybe referring to the accounting schedule for depreciation on computer hardware, not the actual useful lifetime of the hardware.

It's a little weird to phrase it like that though because you're right it doesn't mean you have to throw it out. Idk if this is some reflection of how IBM handles finance stuff or what. Certainly not all companies throw out hardware the minute they can't claim depreciation on it. But I don't know the numbers.

Anyways, 5 years is an inflection point in the numbers. Before 5 years you get depreciation to offset some cost of running. After 5 years you do not, so the math does change.

skeeter2020 16 hours ago | parent [-]

That is how the investments are costed though, so it makes sense when we're talking about return on investment: you can compare with alternatives under the same evaluation criteria.

coliveira 15 hours ago | parent | prev | next [-]

There is the opportunity cost of using a whole datacenter to house ancient chips, even if they're still running. You're thinking of it like a personal-use chip, which you can run as long as it's not defective. But for datacenters it doesn't make sense to use the same chips for more than a few years, and I think 5 years is already stretching their real shelf life.

lithos 15 hours ago | parent | prev | next [-]

It's worse than that in reality: AI chips are on a two-year cadence for backwards compatibility (NVIDIA can basically guarantee it, and you probably won't be able to pay real AI devs enough to stick around to make hardware workarounds). So their accounting is optimistic.

Patrick_Devine 10 hours ago | parent [-]

5 years is normal-ish depreciation time frame. I know they are gaming GPUs, but the RTX 3090 came out ~ 4.5 years before the RTX 5090. The 5090 has double the performance and 1/3 more memory. The 3090 is still a useful card even after 5 years.

more_corn 9 hours ago | parent | prev | next [-]

When you operate big data centers it makes sense to refresh your hardware every 5 years or so, because that's the point at which the refreshed hardware is enough better to be worth the effort and expense. You don't HAVE to, but it's more cost-effective if you do. (Source: used to operate big data centers.)

protocolture 7 hours ago | parent | prev [-]

Actually my biggest issue here is that, assuming it hasn't paid off, you don't just convert to regular data center usage.

Honestly if we see a massive drop in DC costs because the AI bubble bursts I will be stoked.