zackify 8 hours ago

I just ran some massive tests on our own CI. I use AMD Turin instances for this on GCP, which were noted as among the fastest in the article.

The most insane part here is that the AMD EPYC 4565P can beat the Turins used by the cloud providers by as much as 2x in single-core performance.

Our tests took 2 minutes on GCP and 1 minute flat on the 4565P, with its 5.1 GHz boost holding steady vs only 4.1 GHz on the GCP machines.

GCP charges $130 a month for 8 vCPUs. ALSO, this is for SPOT instances that can be killed at any moment.

My 4565P is a $500 CPU... 32 vCPUs... racked in a datacenter. The whole machine cost under $2k.

I am trying hard to convince more people to rack their own hardware, especially for CI runners. With the cloud provider charging $130/mo for 3x fewer vCPUs, you break even in a couple of months; it doesn't matter if it dies a few months later. On top of that you're getting fully dedicated hardware and 2x the performance. Anyway... glad to see I chose the right CPU type for GCloud, even though nothing comes close to the cost/perf of self-racking.

AussieWog93 3 hours ago | parent | next [-]

Hetzner charges between €10 and €48 a month for an 8-vCPU setup, depending on how many other users you're happy to share with.

For €104/mo you can get a 16-core Ryzen 9 7950X3D (basically identical to your 4565P) with 128 GB DDR5 and 2x 2 TB PCIe Gen4 SSDs.

That's not to say you're wrong about dedicated being much better value than a VPS on a performance-per-dollar basis, but the markup the European companies charge is much, much lower than what you'd pay in the US.

In this instance you're looking at a ~17-month payback period even ignoring colo fees. Assuming the ~$100/mo colo fee a sibling comment suggested, you're looking at closer to 8 years.

Aurornis 2 hours ago | parent [-]

Great points. If we’re going to talk about dedicated servers and long lock-in contracts, you have to look at the equivalent prices for hosted alternatives.

It’s fun to start thinking about building your own server and putting it in a rack, but there’s always a lot of tortured math involved in comparing it to completely different cloud-hosted solutions.

One of the great things about cloud instances is that I can scale them up or down with the load without being locked into some hardware I purchased. For products I’ve worked on that have activity curves that follow day-night cycles or spike on holidays, this has been amazing. In some cases we could auto scale down at night and then auto scale back up during the day. As the user base grows we can easily switch to larger instances. We can also geographically distribute servers and provide lower latency.

There is a long list of benefits that are omitted when people make arguments based solely on monthly cost numbers. If we’re going to talk about long term dedicated server contracts we should at least price against similar options from companies like Hetzner.

vladvasiliu 8 minutes ago | parent [-]

> One of the great things about cloud instances is that I can scale them up or down with the load without being locked into some hardware I purchased. For products I’ve worked on that have activity curves that follow day-night cycles or spike on holidays, this has been amazing. In some cases we could auto scale down at night and then auto scale back up during the day.

At work we have this day/night cycle. But for some reason we're married to AWS. If we provisioned a bunch of servers 24/7/365 at Hetzner or the like to cover the peaks with some margin, it would still be cheaper by a notable margin. Sure, 90% of them would twiddle their thumbs from 10 PM to 10 AM. So what?

Sure, if your clients are completely unpredictable and you'll see 100x traffic without notice, the cloud is great.

But how many companies are actually in that kind of situation? Looking back over a year or two, we're quite reliably able to predict when we'll have more visitors and how many more compared to baseline. We could just adjust the headroom to be able to take in those spikes. And I suppose if you want to save the environment, you could just turn off the Hetzner servers while they sit unused.

Aurornis 7 hours ago | parent | prev | next [-]

> My 4565p is a $500 cpu... 32 vcpus... racked in a datacenter. The machine cost under 2k.

> The cloud provider charging $130 / mo for 3x less vcpus you break even in a couple months, it doesn't matter if it dies a few months later

How do you calculate break even in a couple months if the machine costs $2,000 and you still have to pay colo fees?

If your colo fees were $100 a month you wouldn’t break even for over five years. You could try to find cheaper colocation, but even with free colocation your example doesn’t break even for over a year.
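The arithmetic in this subthread is simple enough to sketch. A minimal example using the numbers above ($2,000 machine, $130/mo cloud bill; the $100/mo colo fee is the assumption from this comment, not a quoted provider price):

```python
# Break-even sketch for a self-racked server vs. a rented cloud instance.
# Formula: months = hardware_cost / (cloud_monthly - colo_monthly).

def breakeven_months(hardware_cost: float, cloud_monthly: float,
                     colo_monthly: float = 0.0) -> float:
    """Months until the up-front hardware spend is recovered."""
    monthly_savings = cloud_monthly - colo_monthly
    if monthly_savings <= 0:
        return float("inf")  # colo fees eat the entire savings
    return hardware_cost / monthly_savings

# Free colocation: ~15 months, i.e. "over a year".
print(round(breakeven_months(2000, 130)))       # -> 15
# $100/mo colocation: ~67 months, i.e. over five years.
print(round(breakeven_months(2000, 130, 100)))  # -> 67
```

The colo fee dominates the result: at $100/mo the break-even point moves from about 15 months out to about 67.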

zackify 7 hours ago | parent [-]

The $130/mo is for 3x fewer vCPUs, so roughly $390/mo if you rented the same core count. Sorry for the poor comparison wording there. The savings pass $1,300 within a few months, and by 6 months you've already paid off the machine.

Colo fees are cheap even if you need more than just 1U. Even with a $50-100/mo fee you easily get way more performance and come out ahead within a year.

Aurornis 7 hours ago | parent [-]

> by 6 months already paid the machine.

You originally said “a couple months” but now it’s 6 months, with an assumption of $0 colocation fees, which isn’t realistic.

In my experience situations rarely call for precisely 32 cores for a fixed period of 3 years to support calculations like this anyway. We start with a small set of cloud servers and scale them up as traffic grows. Today’s tooling makes it easy to auto scale throughout the day, even.

When trying to rack a server everyone aims higher because it sucks to start running into limits unexpectedly and be stuck on a server that wasn’t big enough to handle the load. Then you have to start considering having at least two servers in case one starts failing.

Racking a single self-built server is great for hobby projects but it’s always more complicated for serving real business workloads.

edoceo 7 hours ago | parent | next [-]

Don't nit-pick the "couple". It was used casually, to mean a not-terribly-long time. So the 2-6 spread, while technically big, is still just a trifle. While I'm nit-picking: upthread is talking about a limited box for CI, and you're talking about scaling up real business workloads. That's just like the difference between 2 and 6. Give it a rest.

Everyone: run your scenarios and expectations in a spreadsheet and then use real data to run your CBA. Your case will be unique(ish) so make your case for your situation.

Aurornis 2 hours ago | parent | next [-]

> So the 2-6 spread, while technically big, is still just a trifle.

I think you’re misreading. Even the 6-month figure was based on an invalid assumption of $0 colocation fees. Add in even cheap colocation fees and it gets pushed out even further.

That’s not really a nit-pick when the claims were based on impossible math. It’s more of a motte-and-bailey, where they come in with a “couple of months” claim that sounds awesome on the surface but falls back to a completely different number once anyone looks at the details.

zackify 6 hours ago | parent | prev [-]

Yeah, thanks for that. I just meant a very fast return.

jjmarr 3 hours ago | parent | prev [-]

You can take a hybrid approach and use the rack for base capacity, cloud for scaling.

oDot 7 hours ago | parent | prev | next [-]

I used to run a site that compares prices[0]. Not only is the ecosystem pull to the cloud strong, but many developers today look at bare metal as downright daunting.

Not sure where that fear comes from. Cloud challenges can be just as complex as bare-metal ones, or more so.

[0]: https://baremetalsavings.com/

satvikpendem 4 hours ago | parent | next [-]

> Not sure where that fear comes from.

Probably because most developers these days have not known a world without using cloud providers, with AWS being 20 years old now.

jmathai 4 hours ago | parent [-]

Racking your own hardware doesn’t get you web UIs and APIs out of the box. At least it didn’t 2 decades ago.

satvikpendem 4 hours ago | parent [-]

Sure, but now it does (via the many OSS PaaS options), so the calculus must change accordingly.

jbverschoor an hour ago | parent | prev | next [-]

Partitioning a server! Omg lol

It’s funny, because AWS did not start this type of business. What they did do is make it possible to pay by the hour. Ephemeral spare compute is what they pioneered.

Yet almost nobody understood the ephemeral part.

You might even be better off running a Mac mini at home on fiber, especially for backend processing.

hamandcheese 5 hours ago | parent | prev | next [-]

> Cloud challenges can be as or more complex than bare metal ones.

Big +1 to this. For what I thought was a modest-sized project, it felt like an NP-hard problem coordinating with gcloud account reps to figure out which regions have both enough Hyperdisk capacity and compute capacity. A far cry from being able to just "download more RAM" with ease.

The cloud ain't magic folks, it's just someone else's servers.

(All that said... still way easier than if I needed to procure our own hardware and colocate it. The project is complete. Just delayed more than I expected.)

keepamovin 5 hours ago | parent | prev [-]

The fragmentation and friction! Comparing prices usually requires 10 open browser tabs and a spreadsheet, which is what keeps people locked into their default cloud. I built a tool to solve this called BlueDot (i.e., Earth, where all the clouds are)[0]. It’s a TUI that aggregates 58,000+ server configurations across 6 clouds (including Hetzner). It lets you view side-by-side price comparisons and deploy instantly from the terminal. It makes grabbing a cheap Hetzner box just as easy as spinning up something on AWS/GCP.

[0]: https://tui.bluedot.ink

bob1029 an hour ago | parent | prev | next [-]

> i am trying hard to convince more people to rack themselves especially for CI actions.

What do you think the typical duty cycle is for a CI machine?

Raw performance is kind of meaningless if you aren't actually using the hardware. It's a lot of up front capex just to prove a point on a single metric.
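A rough sketch of that point: the effective cost per useful CPU-hour depends heavily on how often the box is actually running jobs. The 10% utilization figure below is an illustrative assumption (CI machines often idle outside work hours, weekends included), and the $2,000 machine / $100-per-month colo numbers come from earlier in the thread:

```python
# Effective cost per busy CPU-hour for a CI box at a given duty cycle.
HOURS_PER_MONTH = 730  # average hours in a month

def cost_per_busy_hour(monthly_cost: float, utilization: float) -> float:
    """Dollars per hour the machine spends actually running jobs."""
    return monthly_cost / (HOURS_PER_MONTH * utilization)

# $2,000 box amortized over 3 years, plus $100/mo colo: ~$156/mo.
owned_monthly = 2000 / 36 + 100

print(round(cost_per_busy_hour(owned_monthly, 0.10), 2))  # 10% busy  -> 2.13
print(round(cost_per_busy_hour(owned_monthly, 1.00), 2))  # 100% busy -> 0.21
```

At a low duty cycle the per-job cost of dedicated hardware rises by the inverse of utilization, which is why the capex case is weaker for bursty CI than for steady workloads.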

darkwater 2 hours ago | parent | prev | next [-]

Yeah, I expected this benchmark to include hosted "metal" hardware in the "per-instruction cost" comparison, to see how providers like Hetzner fare against classic AWS VMs. It's a bit apples to oranges, I know, but I think that's what most people comparing pure performance per cost are interested in nowadays. I'm not going to migrate from AWS VMs to GCP or Hetzner VMs, but I might be open to Hetzner hosted servers instead for a massive enough cost reduction.

justinclift 34 minutes ago | parent [-]

> ... but I might be open to Hetzner hosted servers instead for a massive enough cost reduction.

Don't use Hetzner for anything actually important to you. :(

As to why: https://news.ycombinator.com/item?id=45481328

vmg12 6 hours ago | parent | prev | next [-]

You can go on OVH and get a dedicated server with 384 threads and a Turin CPU for $1,147 a month. You have to pay another $1,147 for installation, and the default config has low RAM and network speeds, but even after upgrading those it's going to be about 1/5 of what it would cost on the public clouds.

icelancer 7 hours ago | parent | prev | next [-]

Self-racking lets you rack a bunch of gear you'd never find in VM/dedicated rentals, like consumer parts or older, still very good parts. Overclocking options are available as well if you DIY.

If you need single-threaded performance, colo is really the only way to go anyway.

We have two full racks and we're super happy with them.

Melatonic 5 hours ago | parent [-]

Or underclocking and undervolting, for even better performance to price/power/longevity ratios.

sroussey 4 hours ago | parent | next [-]

For a single rack, you really don’t have too many choices for power. You choose a provisioning level and pay for it; I never had anyone check how much of that I actually used and give me money back. Maybe things have changed, though.

icelancer 5 hours ago | parent | prev [-]

No doubt. Especially for GPU inference at scale. We overclock/overvolt for training and tune way down for inference.

tempay 6 hours ago | parent | prev | next [-]

This is basically the premise of https://www.blacksmith.sh/ as far as I know, though without the need to host the hardware yourself and the potential complexity that comes with that.

sroussey 4 hours ago | parent [-]

I did have some MySQL servers racked for over a decade and I was afraid to restart the machines. And yes as new versions of MySQL came out I did have to compile them myself.

Similar lower specced machines that were closer to the public internet had boot disk failures, but I had a few of them, so it wasn’t an issue. Spinning metal and all.

One of the db servers dying would have required a next day colo visit… so I never rebooted.

ahartmetz 6 hours ago | parent | prev | next [-]

"vCPUs" are a bit of a scam in my experience. You usually don't get what the hardware (according to /proc/cpuinfo) is capable of.

dkechag 7 hours ago | parent | prev | next [-]

A 16-core 4565P is of course faster in max single-thread speed than a 96-core part that GCP runs at an economically optimal base clock.

A year ago I gave a talk about optimizing cloud cost efficiency, and I did a comparison of colocation vs cloud over time. You might find it interesting; linking to the relevant part: https://youtu.be/UEjMr5aUbbM?si=4QFSXKTBFJa2WrRm&t=1236

TL;DR: colocation broke even in 6 to 18 months versus on-demand and 3-year reserved cloud respectively. But spot instances can actually be quite a bit cheaper than colocation.

You generally don't go to the cloud for the price (except if we're talking Hetzner etc.).

alberth 3 hours ago | parent | prev | next [-]

Both Datapacket & OVH have the 4565P.

This proc is a hidden gem.

For most workloads it’s not just the most performant, but also the best bang-for-buck.

nerdsniper 3 hours ago | parent [-]

I don't see the 4565P at Datapacket or OVH. But that doesn't invalidate your comment.

api 6 hours ago | parent | prev [-]

Big cloud is ludicrously expensive. It’s truly amazing. Bandwidth is even worse: it’s like a 10,000x markup.

sroussey 4 hours ago | parent [-]

It’s wild that no one knows just how cheap bandwidth really is. AWS pulled one over on people; it’s like the movie studios still demanding 10% off the top for VHS distribution. Today.

jbverschoor an hour ago | parent [-]

That’s the case with every industry.

Make things look like a complicated black box. Make sure it feels scary to roll your own. Hide the core technical skills behind layers of abstraction.