shubhamjain 6 hours ago

If you think you need to spend $100B, does using a third-party cloud provider still make sense? It doesn’t matter what sweet deal Amazon is pitching—in that scenario, you’d want to own your stack. Especially in a hyper-competitive field like this, where margins are going to matter a lot soon.

It feels like these hyperscalers are just raising as much money as they can on extremely rosy projections, because sooner or later the peak is going to be reached (if it hasn't happened already).

IMTDb 4 hours ago | parent | next [-]

The problem is that at that scale, the alternative is building your own data centers. You'd probably want at least 2 in the US, 2 in Europe, 2 in Asia, maybe 1 in Africa and 1 in LATAM. So 8-10, and you need at least half of them ready "on time."

What does "on time" mean? You'll need to negotiate with local authorities, some friendly, some not. Data centers aren't exactly popular neighbors these days. Then negotiate with the local power utility. Fingers crossed the political landscape doesn't shift and your CEO doesn't sign a contract with an army using your product to pick bombing targets, because you'll watch those permits evaporate fast.

Then there's sourcing: CPUs, GPUs, memory, networking. You need all of it. Did you know the lead time for an industrial power transformer is 5+ years? Don't get me started on the water treatment pumps and filters you can't even get permitted without. What will you do in the meantime? You surely aren't gonna get preferential treatment from AWS / Google / ... if they know you are moving away anyway. Your competition will.

The risk and complexity are just too big. AI/LLM is already an incredibly complex and brittle environment with huge competition. Getting distracted building data centers isn't enticing for these companies, it's a death sentence.

electroly 4 hours ago | parent | next [-]

For AI inference you don't need to geographically distribute your data centers. Latency, throughput, and routes don't matter here. When it's 10 seconds for the first token and then a 1KB/sec streamed response, whatever is fine. You can serve Australia from the US and it'll barely matter. You can find a spot far outside populated areas with cheap power, available water, and friendly leadership, then put all of your data centers there. If you're worried about major disasters, you can pick a second city. You definitely don't need a data center in every continent.

You're not wrong about the rest but no AI company would ever build a data center in every continent for this, even if they were prepared to build data centers. AI inference isn't like general purpose hosting.

pohl 3 hours ago | parent | next [-]

Sounds like you're betting that the performance users experience today will be the same as the performance they'll expect tomorrow. I wouldn't take that bet.

electroly 3 hours ago | parent | next [-]

You mean that if you were Anthropic, you'd build the data centers on every continent? Can you explain your reasoning?

We're talking about billions of dollars of extra capex if you take the "let's build them everywhere" side of the bet instead of "let's build them in the cheapest possible place" side. It seems to me that you'd have to be really sure that you need the data center to be somewhere uneconomical. I think if you did build them in the cheap place, it's a safe bet that you'll always have at least enough latency-insensitive workloads to fill it up. I doubt that we would transition entirely to latency-sensitive workloads in the future, and that's what would have to happen for my side of the bet to go wrong. The other side goes wrong if we don't see a dramatic uptick in latency-sensitive inference workloads. As another comment pointed out, voice agents are the one genuinely latency-sensitive cloud inference workload we have right now; they do need low latency for it. Such workloads exist, but it's a slim percentage so far.

I believe I'm taking the safe bet that lets Anthropic make hay while the sun shines without risking a major misstep. Nothing stops them from using their own data centers for cheap slow "base load" while still using cloud partners for less common specialized needs. I just can't see why they would build the international data centers to reduce cloud partner costs on latency-sensitive workloads before those workloads actually show up in significant numbers.
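The bet can be sketched in back-of-the-envelope expected-capex terms. Every number below is an illustrative assumption (site costs, demand probability, late-build penalty), not anything from Anthropic:

```python
# Illustrative expected-capex comparison: build data centers everywhere now
# vs. build in one cheap region and expand only if latency-sensitive
# workloads actually materialize. All figures are made-up assumptions.

def expected_capex(build_everywhere: bool,
                   p_latency_demand: float = 0.2,
                   cheap_site_cost: float = 10e9,
                   per_region_premium: float = 3e9,
                   extra_regions: int = 6,
                   late_build_penalty: float = 1.5) -> float:
    """Expected capex in dollars under the stated assumptions."""
    if build_everywhere:
        # Pay the regional premium for every site up front, needed or not.
        return cheap_site_cost + extra_regions * per_region_premium
    # Build cheap now; pay a penalized regional premium only if
    # latency-sensitive demand shows up (probability p_latency_demand).
    return cheap_site_cost + p_latency_demand * (
        extra_regions * per_region_premium * late_build_penalty)

everywhere = expected_capex(build_everywhere=True)
cheap_first = expected_capex(build_everywhere=False)
print(f"build everywhere now: ${everywhere / 1e9:.1f}B expected")
print(f"cheap site first:     ${cheap_first / 1e9:.1f}B expected")
```

Under these (hedged) numbers, the cheap-site-first strategy comes out well ahead even after paying a 50% penalty for building late.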

PunchyHamster an hour ago | parent | prev [-]

You can build a geographically close one tomorrow, once you start earning money today. US-EU latency is ~100 ms; AI can handle it just fine.

TSiege 4 hours ago | parent | prev [-]

latency absolutely matters? this is such a weird thing to say. for training sure, but customers absolutely want low latency

electroly 4 hours ago | parent | next [-]

They want it, sure. Customers want everything if it's free, but this is about what they value with their money. In this thought experiment, you're Anthropic, not the customer. You're making a choice that's best for Anthropic. Will Anthropic lose customers because the latency is higher? No way. Customers want low cost and lots of usage more than they want low latency. In a cutthroat race to the bottom, there's no room to "give away" massively expensive freebies like a data center near every population center when the customer doesn't value those extras with actual money. It's the same reason we all tolerate the relatively slow batched token generation rate--the batching dramatically lowers the cost, and we need low cost inference more than we want fast generation. If the cost goes up we'll actually leave, for real.

After the initial announcement of "fast mode" in Claude Code, did you ever hear about anyone using it for real? I didn't. Vanishingly few people are willing to pay extra for faster inference.

Remember that the time-to-first-token is dominated by the time to process the prompt. It's orders of magnitude more latency than the network route is adding. An extra 200 milliseconds of network delay on a 5-10 second time-to-first-token is not even noticeable; it's within the normal TTFT jitter. It would be foolish to spend billions of dollars to drop data centers around the world to reduce the 200 milliseconds when it's not going to reduce the 5-10 seconds. Skip the exotic locales and put your data centers in Cheap Power Tax Haven County, USA. Perhaps run the numbers and see if Free Cooling City, Sweden is cheaper.
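A quick sanity check on that proportion, using the 5-10 s TTFT and ~200 ms figures from the comment above:

```python
# Fraction of total time-to-first-token attributable to an extra
# long-haul network hop. TTFT and delay figures are the rough ones
# cited in the comment, not measurements.

def latency_overhead(ttft_s: float, extra_network_ms: float = 200) -> float:
    """Fraction of total TTFT added by the network hop."""
    extra_s = extra_network_ms / 1000
    return extra_s / (ttft_s + extra_s)

for ttft in (5, 10):
    print(f"TTFT {ttft}s: network adds {latency_overhead(ttft):.1%}")
```

At a 5 s TTFT the distant route adds only a few percent, well inside normal TTFT jitter.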

beisner 2 hours ago | parent [-]

They’re unwilling to pay for fast mode because of the current step function price increase once you hit your quota. It’s a psychological effect. Because most shops I know in the US currently paying $125/mo per seat for Claude would happily - HAPPILY - pay 2x, and begrudgingly pay 10x that amount for the same service. If fast mode was priced 25% or 50% more they’d happily pay for that too. But it’s just not priced that way currently with weird growth subsidization & psychology.

CuriouslyC 3 hours ago | parent | prev | next [-]

The only AI use case that cares about latency is interactive voice agents, where you ideally want <200ms response time, and 100ms of network latency kills that. For coding and batch job agents anything under 1s isn't going to matter to the user.
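The latency budget works out roughly like this (the ~200 ms target and 100 ms round trip are the figures from the comment; a simple sketch, not a real pipeline model):

```python
# Voice-agent latency budget: with a ~200 ms target for a natural-feeling
# response, a 100 ms network round trip eats half the budget before any
# inference happens. A coding agent's ~1 s tolerance barely notices it.

def inference_budget_ms(target_ms: float, network_rtt_ms: float) -> float:
    """Milliseconds left for model inference after network transit."""
    return max(0.0, target_ms - network_rtt_ms)

print(inference_budget_ms(200, 100))   # voice agent, distant DC
print(inference_budget_ms(200, 10))    # voice agent, nearby DC
print(inference_budget_ms(1000, 100))  # coding agent, distant DC
```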

coredog64 22 minutes ago | parent | next [-]

A customer service chatbot can require more than one LLM call per response to the point that latency anywhere in the system starts to show up as a degraded end-user experience.

electroly 3 hours ago | parent | prev [-]

tbh, that's a good point about the voice agents that I hadn't considered. I guess there are some latency-sensitive inference workloads. Thanks for pointing that out.

devolving-dev 2 hours ago | parent [-]

Yeah, also stuff like robotics which might not really exist today but could be big in the future.

blmarket 2 hours ago | parent | prev [-]

Easy solution: use hyperscalers, with their super-expensive API charges, only when latency really matters. Otherwise build your own DC. It's easy to see that customers don't care about latency that much compared to money.

hn_throwaway_99 an hour ago | parent | prev | next [-]

Maybe for right now, but even in the very near future it seems like data center expertise would absolutely be a core competency of any AI leaders.

Heck, look at Facebook. Granted, they got started slightly before AWS, but not by much. Owning all of their own data centers is a huge competitive advantage for them, and unlike most of the other hyperscalers they don't sell compute to other companies (AFAIK).

Again, the commitment is for $100 billion in spend. Building lots of data centers for a lot less than that should absolutely be doable. Also, geographic distribution isn't nearly as important for AI companies, given the way LLMs work. The primary benefit of being close to your data center is reduced latency, but if you think about your average chatbot interface, inference time absolutely swamps latency, so it's not as big a deal. Sure, you'd probably need data centers in different locales for legal reasons, and for general diversification, but, one more time, $100 billion should buy a lot of data centers.

amluto 4 hours ago | parent | prev | next [-]

Other than data sovereignty, does the data center location really matter that much? Current inference systems are not exactly low latency.

Aurornis 4 hours ago | parent | next [-]

It’s the power and water needs.

Large data centers consume as much power as a small city. The location decision is about being able to connect to a power grid that is ready to supply that.

Evaporative cooling also needs steady water supply. There are data centers which don’t operate on evaporative cooling but it’s more equipment intensive and expensive.

Latency doesn’t matter. You can get fast enough internet connected to these sites much more easily than finding power.
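As a rough illustration of why power dominates siting (per-GPU draw and PUE overhead here are ballpark assumptions, not figures for any specific facility):

```python
# Ballpark facility power for a 100k-GPU site. Assumes ~1 kW per GPU
# including its share of the host server, and a PUE of ~1.3 for cooling
# and distribution overhead. Both are illustrative assumptions.

def site_power_mw(num_gpus: int, kw_per_gpu: float = 1.0,
                  pue: float = 1.3) -> float:
    """Total facility draw in megawatts, including overhead (PUE)."""
    return num_gpus * kw_per_gpu * pue / 1000

print(f"{site_power_mw(100_000):.0f} MW")  # on the order of a small city
```

Finding a grid interconnect that can deliver that is the hard part; a fiber route that can carry the traffic is comparatively trivial.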

dec0dedab0de 3 hours ago | parent | prev | next [-]

Location matters for disaster recovery, if they want to survive WWIII. Though I think Data Sovereignty is probably a bigger thing, especially if they're going to be selling to governments around the world.

YetAnotherNick 2 hours ago | parent [-]

Why do they need to sell to governments around the world? I highly doubt any European government is in the top 100 customers of any US lab.

sophacles 4 hours ago | parent | prev [-]

* not every task is waiting on the inference. Lowering latency on other, serial tasks can still have a noticeable effect: login, MCP queries, etc.

* data transit across the world can be very slow when there are network issues (a fiber cut somewhere, congestion, BGP doing its thing, etc). Having something more local can mitigate this.

* several countries right now have demented leaders with idiotic cult-like followers. Best not to put all your eggs in those baskets.

* wars, earthquakes, fires, floods, and severe weather rarely affect the whole planet at once, but can have rippling effects across a continent.

And frankly, the real question isn't "why spread out the DCs?", it's "what reason is there to put them close to each other?".

RealityVoid 2 hours ago | parent | prev | next [-]

Take the approach Geohot is suggesting: take a shipping container, make a standard layout for cooling and compute load, find a cheap source of electricity, place it, and you have compute.

whattheheckheck 2 hours ago | parent [-]

Surely if it was that easy it'd be done?

mech422 an hour ago | parent [-]

It has been done... We used to get our POP gear built out from Dell (?) in shipping containers - pre-racked, wired, and cooled - just add network/power feeds. We'd have them dropped places we needed more capacity but there wasn't space available in the DC.

mistrial9 3 hours ago | parent | prev | next [-]

not sure what you are describing; however, a random item from the business news: in 2026, low-tech Chile is building sixty datacenters in or near Santiago.

imtringued 3 hours ago | parent | prev [-]

Translation: Anthropic never intends to spend $100 billion on AWS.

Every single argument you've brought up is irrelevant in the face of billions of dollars. If you intend to consume $100 billion dollars in data center infrastructure, you're going to find a way to accomplish it while cutting out the middlemen.

Meanwhile if you're flaky and never intend to spend that money, you're going to come up with a way to pay someone else to deal with those problems and quit paying the moment they don't.

You'd never do both at the same time. You'd never commit your money and give them control over your business critical infrastructure.

Hence the deal is a sham. The $100 billion are a lie. Thank you for telling us.

MeetingsBrowser 5 hours ago | parent | prev | next [-]

Going from a company with no experience building and operating datacenters to a company with $100B worth of compute is a multi-decade, high-risk goal.

MrBuddyCasino 4 hours ago | parent [-]

xAI built a datacenter in a few weeks, if I remember correctly.

Aurornis 4 hours ago | parent | next [-]

That’s PR hype. They built it quickly, but they didn’t go from deciding they wanted a data center to having it running in weeks.

You can’t even get the hardware at that scale without months or years of order lead time. NVidia doesn’t have warehouses full of compute hardware waiting for someone to come get it.

They also reused an existing building. Basically, they put 100,000 GPUs into a building and attached the necessary infrastructure in about half a year. Impressive, but it’s not the same as a $10B/year data center usage commitment like this deal.

imtringued 3 hours ago | parent [-]

Why does this matter? The deal is supposed to last 10 years. If you don't pay AWS to order Nvidia GPUs for you, Nvidia won't have to deliver them to AWS; they will have exactly the same quantity of GPUs, but this time they can deliver them to you.

drw85 3 hours ago | parent [-]

Because you can spend your 100 billion dollars spread over 10 years.

If you build datacenters, you have to spend that money now.

They're also not paying amazon to order GPUs, they're paying for compute usage of whatever hardware they have.

0xbadcafebee 4 hours ago | parent | prev | next [-]

And they used illegal power to do it (which will now give local poor people health disorders at 4x the national average). They likely violated every law possible in the process, like OSHA standards and overtime rules. Musk loves to overwork people.

MeetingsBrowser 4 hours ago | parent | prev [-]

xAI built the Colossus data center in 122 days (just the physical construction time).

Colossus initially had ~200k GPUs. 100B buys you ~1 million high end GPUs running 24/7 for a year at AWS retail prices.
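That GPU count roughly checks out if you assume something on the order of $11-12 per GPU-hour, in the ballpark of on-demand 8-GPU H100-class instance pricing divided by eight (an assumption for illustration, not a quoted AWS rate):

```python
# Sanity check on "100B buys ~1 million GPUs running 24/7 for a year
# at retail prices". The hourly rate is an assumed ballpark figure.

HOURS_PER_YEAR = 24 * 365  # 8760

def gpus_for_budget(budget_usd: float, usd_per_gpu_hour: float) -> float:
    """Number of GPUs runnable around the clock for one year on the budget."""
    return budget_usd / (usd_per_gpu_hour * HOURS_PER_YEAR)

print(f"{gpus_for_budget(100e9, 11.5):,.0f} GPUs")
```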

Aurornis 4 hours ago | parent [-]

Initial Colossus buildout was 100K GPUs

They also reused an existing building that happened to be in the right place at the right time. The larger data center buildouts would almost always need new, dedicated construction.

dktp 6 hours ago | parent | prev | next [-]

I think these pledges offload some of the risk onto Amazon/Oracle/etc

If Anthropic/OpenAI miss projections, infra providers can likely still turn around and sell the capacity to the next guy, or use it themselves. If they have more demand than expected (as Anthropic currently does), VCs will throw money at them and they can outbid the competition.

If they built it themselves and missed projections it's a much more expensive mistake

It's just risk sharing. Infra providers take some of the risk and some of the upside

throwup238 5 hours ago | parent [-]

> If they built it themselves and missed projections it's a much more expensive mistake

Not if their pricing comes with multiyear commitments for reserved pricing. No doubt they get a huge volume discount, but the advertised AWS reserved pricing is already enough to pay for a whole 8x HX00 pod, plus the NVIDIA enterprise license, plus the staff to manage it, after only a one-year commitment. On-demand pricing is significantly more expensive, so they're going to be boxed in by errors in capacity planning anyway (as has been happening the last few months).
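As a rough sketch of that break-even claim: every figure below (pod capex, license and staffing share, hourly reserved rate) is an assumed illustrative number, not an actual AWS or NVIDIA price:

```python
# Break-even sketch: one year of reserved cloud pricing for an 8-GPU
# instance vs. buying a comparable pod outright. All dollar figures
# are assumptions chosen for illustration only.

HOURS_PER_YEAR = 24 * 365  # 8760

def one_year_reserved_cost(usd_per_hour: float) -> float:
    """Cost of running one reserved instance 24/7 for a year."""
    return usd_per_hour * HOURS_PER_YEAR

pod_capex = 350_000          # assumed: 8-GPU server, networking, install
license_and_staff = 150_000  # assumed: enterprise license + ops share
own_cost = pod_capex + license_and_staff

reserved = one_year_reserved_cost(60.0)  # assumed $/hr, 8-GPU reserved
print(f"1yr reserved: ${reserved:,.0f}  vs  own: ${own_cost:,.0f}")
```

Under these assumed numbers, one year of reserved spend already exceeds the cost of owning the hardware, which is the shape of the argument above.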

The economics here are absurd unless you’re involved in a giant circular investment scheme to pump up valuations.

dweekly 5 hours ago | parent [-]

The pricing models that are published on AWS' website almost certainly have almost nothing to do with the pricing models that are discussed behind closed doors for a $100 billion commitment.

throwup238 4 hours ago | parent | next [-]

Of course not, but unless they're getting the sweetheart deal of a lifetime from Amazon of all places, it's still hogwash. We're talking about enough capital to build their own fab and a dozen datacenters*. This deal isn't going to be buying existing capacity, because that's already stretched; it will be paying for new buildouts.

Afterwards Amazon will be milking the machines these commitments buy for nearly a decade. That tradeoff makes sense at a small scale (even up to $X00 million or even billions), but at $Y0 or $Z00 billion?

Color me skeptical. There are plenty of other side benefits like upgrading to the newest GPUs every few years, but again we’re talking about paying for new buildouts with upfront commitments anyway.

* obviously the timelines, scientific risk, and opportunity cost make this completely infeasible but that’s the scale we’re talking about. It’s a major industrial project on the scale of the thirty year space shuttle program (~$200 billion).

coredog64 17 minutes ago | parent [-]

You can get a significant AWS discount with an annual spend starting around $1M/year.

5 hours ago | parent | prev [-]
[deleted]
credit_guy 6 hours ago | parent | prev | next [-]

Here’s the answer to your question (from the article):

> The Anthropic deal specifically covers Trainium2 through Trainium4 chips, even though Trainium4 chips are not currently available. The latest chip, Trainium3, was released in December. On top of that, Anthropic has secured the option to buy capacity on future Amazon chips as they become available.

deskamess 5 hours ago | parent | next [-]

So it comes down to how much of that $100 bn is in the 'option', I guess. Then it's not an expense at all.

superkuh 5 hours ago | parent | prev [-]

Ah. So it's a scalper situation, where an unethical entity buys up all the supply and then resells it at a greater price.

t0mas88 3 hours ago | parent [-]

Amazon isn't buying and reselling Trainium chips, those are their in house developed custom chips.

neya 4 hours ago | parent | prev | next [-]

I remember seeing an extremely shocking graph on Facebook of the top AI companies, showing how the money just keeps changing hands between a handful of companies. It almost seemed like a scam.

neffy 2 hours ago | parent | next [-]

It is a similar kind of lending loop to the one that went on during the late 1990s, leading up to the 2000 crash: A lends to B lends to C lends to A.

There is a famous quote from the Polish economist Kalecki, that "economics is the science of mistaking a stock for a flow". Essentially, this form of lending continues while everybody can make interest payments, and blows up horribly as soon as somebody can't - as I have no doubt all those concerned are fully aware.

Aurornis 4 hours ago | parent | prev [-]

Money doesn’t just flow around with nothing exchanged. The money is in payment for goods and services.

It’s common even for smaller companies to do mutually beneficial business with each other. It’s actually helpful to do business with people who are also your customers because you have a relationship with them and you also have leverage: They are extra incentivized to treat you well because they don’t want to upset any of the other business you have with them.

JumpCrisscross 5 hours ago | parent | prev | next [-]

> It doesn’t matter what sweet deal Amazon is pitching

Isn't that almost all that matters when comparing doing something yourself versus paying someone else, in this case Amazon, to do it for you?

etempleton 5 hours ago | parent | prev | next [-]

In a rational business, yes, but when everything is basically some form of growth signal to investors, to extract even more money from them before the music stops, it doesn't matter.

LogicFailsMe 6 hours ago | parent | prev | next [-]

Classic time-value-of-money situation. They get access to the HW now so they can continue to grow the business. Of course, if you think AI is just pets.com redux, I can see how you'd think it's already peaked. All those years of very important people insisting Bezos couldn't just pull the switch on reinvesting all the revenue into growing Amazon, and then he did exactly that, come to mind.

bombcar 5 hours ago | parent | prev | next [-]

If you’re sure it’s going to go gangbusters you want to get it all in-house asap.

If you’re not sure it’s going to blow the socks off, foisting capital investment on partners is a great deal.

See the difference in companies/franchises that always own the land/building and those that always lease.

samdixon 5 hours ago | parent | prev | next [-]

From my understanding, if you want to use native Claude in AWS Bedrock, it runs from an AWS datacenter. I'm guessing that's why regardless of running your own stack... they still need a footprint in all the major clouds.

nashashmi 4 hours ago | parent | prev | next [-]

No. I am guessing that this is only a commitment and they will waver on committing.

However, there are certain advantages, like supply chain access, that only established companies have. This is also a commitment to spend up to $100B on an internal approach and research. I would expect them to come up with their own CPU chip and device design. This will shift the focus to an internal approach, and might make Amazon give better prices later down the line.

lubujackson 5 hours ago | parent | prev | next [-]

Look at GPU and RAM prices and data center rollout. We have quickly reached Earth's capacity for compute - it is a lot like the housing market. Once there is global saturation, the price to buy becomes increasingly high EVERYWHERE. Let's also not forget that Anthropic moves the market with their purchases and usage. They might literally be unable to buy capacity they need (or project to) and are doing this deal to pave a roadmap for the near-term and to keep global prices (somewhat) down.

JumpCrisscross 5 hours ago | parent [-]

> We have quickly reached Earth's capacity for compute

Why this versus us being in a temporary bottleneck? Like, railroads became expensive to build everywhere in the 19th century not because we reached Earth's capacity for railroads or whatever, but because we were still tooling up the industry needed to produce them at higher scales.

jimjeffers 3 hours ago | parent | prev | next [-]

My guess is they are bound not by capital as much as they are physical resources. Amazon probably has the land, crews, etc. to build out more data centers faster than Anthropic can right now. The scarce resources are the chips and electricians not the money!

tahoeskibum 3 hours ago | parent | prev | next [-]

That is why only SpaceX/X.ai has the true advantage...

hnav 3 hours ago | parent [-]

maybe in the game of promising ludicrous things. There's no realistic plan to put compute in space.

dgellow 4 hours ago | parent | prev | next [-]

Anthropic also has their own servers

bilekas 5 hours ago | parent | prev | next [-]

I imagine it comes down to whether they want to buy hardware every generation; that gets very expensive and depreciates quickly. You'd then have a whole load of assets on your books that are technically obsolete for the bleeding edge. This way, AWS buys and maintains the hardware and Anthropic doesn't need to claim it as depreciation?

Just a guess.

Tepix 6 hours ago | parent | prev | next [-]

Sure: If you can't get enough compute by ordering it yourself, make deals with anyone who promises to get you more compute.

dec0dedab0de 3 hours ago | parent | prev | next [-]

They're not trying to build a sustainable business. They're trying to get as much market share and lock-in as possible before the bubble bursts. This makes a ton of sense from that perspective. It probably would be cheaper for them in the long run to own their own hardware, but they are paying AWS for their expertise so they can focus on what they do. If it doesn't work out, it also sets them up for a merger with Amazon.

I do think a ton of businesses would benefit from running their own hardware, but they're not getting five billion dollars to stay on the cloud.

0xbadcafebee 4 hours ago | parent | prev | next [-]

There is no money or time left to build a $100B stack. All private capital is tapped and banks know it's too risky. They have no choice but to rent.

nickorlow 4 hours ago | parent | prev | next [-]

AWS exists and has compute right now, spinning up their own HW would take months (at least). This gets them moving quicker.

avereveard 5 hours ago | parent | prev | next [-]

Cannot get Trainium anywhere else, and NVIDIA commands a super high premium.

DANmode 4 hours ago | parent | prev | next [-]

> you’d want to own your stack.

Everybody does right now, right?

But: is it your core competency?

Can your firm afford the distraction?

vasco 5 hours ago | parent | prev | next [-]

That is a project you can work on at any point in the future, and the more you delay it, the more certain you'll be about what you really need. But those additions to the P&L are capped at the costs.

In the meantime, if you work on revenue-generating work, that side of the P&L is uncapped. So you can either put some engineers on reducing your costs by at most 100%, or, if they worked on product ideas, they could be working on things that generate over 9000% more revenue.

Zababa 6 hours ago | parent | prev | next [-]

I think it could make sense to not want to own the stack if you think it's going to cost you velocity/focus? Which is probably the play here. But I'm not certain at all.

loveparade 6 hours ago | parent | prev | next [-]

Good luck getting GPUs.

Culonavirus 6 hours ago | parent | prev | next [-]

Only Google and xAI build their own, no? I don't think it's that easy to vertically integrate massive datacenters into a software company. Both Google and xAI (Tesla, SpaceX) have a massive wealth of experience when it comes to building factories.

tren_hard 4 hours ago | parent | next [-]

Facebook and Oracle also build their own, at least before the last couple years where they’ve financed out to new bag holders.

jeffbee 6 hours ago | parent | prev [-]

New level of glazing Elon Musk unlocked. xAI has a vertical integration advantage because Tesla once moved into an old Toyota factory and because once they paid Panasonic to put a Tesla sign outside a Panasonic battery factory. Incredible content.

petesergeant 5 hours ago | parent [-]

I would struggle to dislike Elon more, but this seems like you’re some kind of weird anti-Musk fanatic

mitchell_h 6 hours ago | parent | prev [-]

I watched someone explain how deepseak got good, and the Chinese approach to LLM training. Really wish I could remember it. The premise was that China thinks of LLMs not as a thing separate from hardware, but gains efficiencies at each layer of the stack. From chips to software, it's all integrated and purpose-built for training.

Wonder if Anthropic is making a mistake by focusing on "consumer" hardware, and not going super specialized.

jubilanti 5 hours ago | parent | next [-]

So you watched some random video from some random YouTuber, didn't even remember who made it, so much so that you didn't even remember that DeepSeek isn't spelled "deepseak", didn't bother to find it or verify, and then you go asserting your memory as fact on a serious discussion forum.

Comments like yours add nothing to the discussion.

throwa356262 5 hours ago | parent | next [-]

I believe he does have a valid point.

You can throw money and hardware at a problem, but then someone may come along with a great idea and leapfrog you.

Just consider that all major AI providers now use DeepSeek's ideas for efficient training from that first paper.

1738384848 4 hours ago | parent | prev [-]

thank you for the aerious discussion my good sir I tip my hat to you

elefanten 6 hours ago | parent | prev | next [-]

DeepSeek uses merchant silicon like everyone else.

edit: I misunderstood, I thought you were implying they designed their own GPUs. nevermind

notyourday 5 hours ago | parent | prev | next [-]

> I watched some explain how deepseak got good and the Chinese approach to LLM training.

I distinctly remember reading a big pantie twisting from Sam Altman and Co that Chinese took their stuff, the stuff OpenAI and Co spent billions to create, and used that as the base for $0.00

renewiltord 5 hours ago | parent | prev | next [-]

It’s fake news, predicated on China not being able to get GPUs. But it turns out everyone was getting their GPUs via serial-number swaps in warehouses.

6 hours ago | parent | prev [-]
[deleted]