At scale (like comma.ai), it's probably cheaper. But until then it's a long term cost optimization with really high upfront capital expenditure and risk. Which means it doesn't make much sense for the majority of startup companies until they become late stage and their hosting cost actually becomes a big cost burden.

There are in-between solutions. Renting bare metal instead of renting virtual machines can be quite nice. I've done that via Hetzner some years ago. You pay just about the same but get a lot more performance for the money. This is great if you actually need that performance.

People obsess about hardware but there's also the software side to consider. For smaller companies, operations/devops people are usually more expensive than the resources they manage. The cost to optimize is that cost. The hosting cost usually is a rounding error on the staffing cost. And on top of that, the number of responsibilities increases as soon as you own the hardware. You need to service it, monitor it, replace it when it fails, make sure those fans don't get jammed by dust puppies, deal with outages when they happen, etc. All the stuff that you pay cloud providers to do for you now becomes your problem. And it has a nonzero cost.

The right mindset for hosting cost is to think of it in FTEs (full time employee cost for a year). If it's below 1 (most startups until they are well into scale up territory), you are doing great. Most of the optimizations you are going to get are going to cost you in actual FTEs spent doing that work. 1 FTE pays for quite a bit of hosting. Think 10K per month in AWS cost. A good ops person/developer is more expensive than that. My company runs at about 1K per month (GCP and misc managed services). It would be the wrong thing to optimize for us. It's not worth spending any amount of time on for me. I literally have more valuable things to do.

This flips when you start getting into the multiple FTEs per month in cost for just the hosting. At that point you probably have additional cost measured in 5-10 FTE in staffing anyway to babysit all of that. So now you can talk about trading off some hosting FTEs for a modest amount of extra staffing FTEs and make net gains.
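To make the FTE framing concrete, here is a minimal sketch of the arithmetic. The fully loaded annual cost of 120,000 per FTE is an assumption, picked only so that 10K per month works out to exactly one FTE as in the example above; the function names and the 50K per month scenario are made up for illustration.

```python
# Illustrative only: FTE_ANNUAL_COST is an assumed figure, not a quoted salary.
FTE_ANNUAL_COST = 120_000  # assumed fully loaded yearly cost of one ops person/developer


def hosting_in_ftes(monthly_hosting_cost: float,
                    fte_annual_cost: float = FTE_ANNUAL_COST) -> float:
    """Express a yearly hosting bill as a number of full-time employees."""
    return monthly_hosting_cost * 12 / fte_annual_cost


def net_gain_in_ftes(hosting_ftes_saved: float, extra_staff_ftes: float) -> float:
    """Positive: the optimization pays for itself. Negative: it costs more than it saves."""
    return hosting_ftes_saved - extra_staff_ftes


if __name__ == "__main__":
    print(hosting_in_ftes(1_000))    # a 1K/month bill
    print(hosting_in_ftes(10_000))   # a 10K/month bill
    # Hypothetical scale-up: 50K/month, halving the bill at the price of one extra hire.
    saved = hosting_in_ftes(50_000) * 0.5
    print(net_gain_in_ftes(saved, extra_staff_ftes=1.0))
```

With these assumed numbers a 1K per month bill is a tenth of an FTE, so even a few weeks of engineering time spent shaving it would cost more than it saves.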

PunchyHamster 7 hours ago

> At scale (like comma.ai), it's probably cheaper. But until then it's a long term cost optimization with really high upfront capital expenditure and risk. Which means it doesn't make much sense for the majority of startup companies until they become late stage and their hosting cost actually becomes a big cost burden.

You rent datacenter space, which is OPEX not CAPEX, and you just lease the servers, which turns a big CAPEX into a monthly OPEX bill.

Running your own DC is a "we have two dozen racks of servers" endeavour, but even just renting DC space and buying servers is much cheaper than getting the same level of performance from the cloud.
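As a rough illustration of the CAPEX versus OPEX point, the sketch below spreads a server purchase over its expected lifetime and compares it with a cloud bill. Every number in it is a made-up placeholder and the function names are hypothetical; plug in your own quotes before drawing conclusions.

```python
import math
from typing import Optional


def colo_monthly_cost(server_price: float, amortization_months: int,
                      rack_rent_per_month: float) -> float:
    """Spread a one-off purchase over its lifetime and add the recurring rack rent."""
    if amortization_months <= 0:
        raise ValueError("amortization_months must be positive")
    return server_price / amortization_months + rack_rent_per_month


def months_to_break_even(server_price: float, rack_rent_per_month: float,
                         cloud_monthly_cost: float) -> Optional[int]:
    """Months until buying plus rack rent beats the cloud bill, or None if it never does."""
    monthly_saving = cloud_monthly_cost - rack_rent_per_month
    if monthly_saving <= 0:
        return None
    return math.ceil(server_price / monthly_saving)


if __name__ == "__main__":
    # Placeholder figures, not real prices.
    print(colo_monthly_cost(server_price=12_000, amortization_months=48,
                            rack_rent_per_month=150))
    print(months_to_break_even(server_price=12_000, rack_rent_per_month=150,
                               cloud_monthly_cost=1_150))
```

The sketch deliberately leaves out staffing, power, and bandwidth, which is the part the rest of the thread argues about.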

> This flips when you start getting into the multiple FTEs per month in cost for just the hosting. At that point you probably have additional cost measured in 5-10 FTE in staffing anyway to babysit all of that. So now you can talk about trading off some hosting FTEs for a modest amount of extra staffing FTEs and make net gains.

YOU NEED THOSE PEOPLE TO MANAGE CLOUD TOO. That's what always gets ignored in calculations. People go "oh, but we really need like 2-3 ops people to cover the datacenter and have shifts on the on-call", but you need the same thing for cloud too; it is just dumped on the programmers/devops guys in the team rather than having separate staff.

We have a few racks, and the hardware-related part is a small part of the total workload; most of it is the same as what we would do (and do for a few cloud customers) in the cloud: writing manifests for automation.

input_sh 4 hours ago

> YOU NEED THOSE PEOPLE TO MANAGE CLOUD TOO.

Finally, some sense! "Cloud" was meant to make ops jobs disappear, but they just increased our salary by turning us into "DevOps Engineers" and the company's hosting bill increased fivefold in the process. You will never convince even 1% of devs to learn the ops side properly, therefore you'll still end up hiring ops people and we will cost you more now.

On top of that, everyone that started as a "DevOps Engineer" knows less about ops than those that started as ops and transitioned into being "DevOps Engineers" (or some flavour of it like SREs or Platform Engineers).

If you're a programmer scared into thinking AI is going to take away your job, re-read my comment.

mattbillenstein 5 hours ago

Honestly, the way I've seen a lot of cloud done, they need _more_ people to manage that than a sensible private cloud setup.

wobfan 7 hours ago

To be fair, I think people are vastly overestimating the work they would have and the power they would need. Yes, if you have to massively scale up, then it'll take some work, but most of it is one-time work. You do it, and once it runs, you only have a fraction of that work over the following months to maintain it. And by fraction, I mean below 5%. And keep in mind that >99% of startups who think "yeah, we need this and that cloud, because we need to scale" will never scale. Instead they are happily locking themselves into a cloud service. And if they actually scale at some point, this service will be massively more expensive.

direwolf20 5 hours ago

Startups don't know how much hardware they need when they release to customers. The extreme flexibility of cloud makes a lot of sense for them.

aforwardslash 4 hours ago

But they should; cloud won't magically make the architecture scale. A competent CTO should know the limits of the platform. It's called "load testing" or "stress testing"; scalability is independent of the provider. Cloud gives you a nicer interface to add resources, granted; but that's it. As a hearsay anecdote, that's why some startups have DB servers with hundreds of GB of RAM and dozens of CPUs to run a workload that could be served from a 5-year-old laptop.
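For readers who have never run one, a load test in its simplest form just fires many concurrent calls at the system and measures how many complete per second. The toy sketch below uses only the Python standard library; `fake_request` is a hypothetical stand-in for a real call to your service, and a serious test would use a dedicated load-testing tool against a staging deployment instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fake_request() -> None:
    """Hypothetical stand-in for one request to the system under test."""
    time.sleep(0.001)  # pretend the service needs about 1 ms per request


def measure_throughput(handler: Callable[[], None], requests: int,
                       concurrency: int) -> float:
    """Call handler `requests` times from `concurrency` threads; return calls per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Consume the iterator so every call finishes before the clock stops.
        list(pool.map(lambda _: handler(), range(requests)))
    elapsed = time.perf_counter() - start
    return requests / elapsed


if __name__ == "__main__":
    for workers in (1, 4, 16):
        rate = measure_throughput(fake_request, requests=200, concurrency=workers)
        print(f"{workers} workers: {rate:.0f} requests/second")
```

Raising the concurrency until throughput stops improving gives a first idea of where the platform's limit is, whatever provider it runs on.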

maccard 7 hours ago

We have two on-site servers that we use. For various reasons (power cuts, internet outages, cleaners unplugging them) I’d say we have to intervene with them physically about once a month. It’s a total pain in the ass, especially when you don’t have an IT person sitting in the office to mind it. I’m in the UK and our office is in Spain…

But it is significantly cheaper and faster.

lelanthran 7 hours ago

Your calculation assumes that an FTE is needed to maintain a few beefy servers.

Once they are up and running that employee is spending at most a few hours a month on them. Maybe even a few hours every six months.

OTOH you are specifically ignoring that you'll require mostly the same time from a cloud-trained person if you're all-in on AWS.

I expect the marginal cost of one employee over the other is zero.

jillesvangurp 6 hours ago

> Once they are up and running

You should also calculate the cost of getting it up and running. With Google Cloud (I don't actually use AWS), I mainly worry about building Docker containers in CI, deploying them to VMs, and triggering rolling restarts as those get replaced with new ones. I don't worry about booting them. I don't worry about provisioning operating systems or configuration to them. Or security updates. They come up with a lot of pre-provisioned monitoring and other stuff. No effort required on my side.

And for production setups. You need people on standby to fix the server in case of hardware issues; also outside office hours. Also, where does the hardware live? What's your process when it fails? Who drives to wherever the thing is and fixes it? What do you pay them to be available for that? What's the lead time for spare components? Do you actually keep those in supply? Where? Do you pay for security for wherever all that happens? What about cleaning, AC, or a special server room in your building? All that stuff is cost. Some of it is upfront cost. Some of it is recurring cost.

The article is about a company that owns its own data center. The cost they are citing (5 million) is substantial and probably a bit more complete. That's one end of the spectrum.

Symbiote 6 hours ago

You are massively overcomplicating this.

> I don't worry about booting them. I don't worry about provisioning operating systems or configuration to them. Or security updates. They come up with a lot of pre-provisioned monitoring and other stuff. No effort required on my side.

These are not difficult problems. You can use the same/similar cloud install images.

A 10 year old nerd can install Linux on a computer; if you're a professional developer I'm sure you can read the documentation and automate that.

> And for production setups. You need people on standby to fix the server in case of hardware issues; also outside office hours.

You could use the same person who is on standby to fix the cloud system if that has some failure.

> Also, where does the hardware live?

In rented rackspace nearby, and/or in other locations if you need more redundancy.

> What's your process when it fails? Who drives to wherever the thing is and fixes it? What do you pay them to be available for that? What's the lead time for spare components? Do you actually keep those in supply? Where?

It will probably report the hardware failure to Dell/HP/etc automatically and open a case. Email or phone to confirm, the part will be sent overnight, and you can either install it yourself (very, very easy for things like failed disks) or ask a technician to do it (I only did this once with a CPU failure on a brand new server). Dell/HP/etc will provide the technician, or your rented datacentre space will have one for simpler tasks like disks.

abc123abc123 3 hours ago

Shush! The cloud companies want customers to think it is a complicated near-death experience to run on their own hardware. It is sad that the knowledge of how easy it really is, is going extinct. The cloud and SaaS companies benefit greatly.

lelanthran 6 hours ago

> You should also calculate the cost of getting it up and running.

I was not doing the calculation. I was only pointing out that it was not as simple as you make it out to be.

Okay, a few other things that aren't in most calculations:

1. Looking at job postings in my area, the highest paid ones require experience with specific cloud vendors. The FTEs you need to "manage" the cloud are a great deal more expensive than developers.

2. You don't need to compare on-prem data center with AWS - you can rent a pretty beefy VPS or colocate for a fraction of the cost of AWS (or GCP, or Azure) services. You're comparing the most expensive alternative when avoiding cloud services, not the most typical.

3. Even if you do want to build your own on-prem rack, FTEs aren't generally paid extra for being on the standby rota. You aren't paying extra. Where you will pay extra is for hot failovers, or machine room maintenance, etc, which you don't actually need if your hot failover is a cheap beefy VPS-on-demand on Hetzner, DO, etc.

4. You are measuring the cost of absolute 0% downtime. I can't think of many businesses that have such high sensitivity to downtime. Even banks handle much more downtime than that, even while their IT systems are still up. With such strict requirements you're getting into the spot where the business itself cannot continue because of a catastrophe, but the IT systems can :-/. What use are the IT systems when the business itself may be down?
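To put point 4 in numbers, an availability target translates directly into a yearly downtime budget. The sketch below is plain arithmetic; the function name and the list of targets are chosen for illustration only.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be down while still meeting the availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 100.0):
        hours = allowed_downtime_hours(target)
        print(f"{target}% availability allows {hours:.2f} hours of downtime per year")
```

A 99.9% target still leaves close to nine hours a year, which is plenty of time to fail over to a standby machine by hand; only the 100% row leaves no room at all.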

The TLDR is:

1. If you have highly paid cloud-trained FTEs, and

2. Your only option other than Cloud is on-prem, and

3. Your FTEs are actually FT-contractors who get paid per hour, and

4. Your uptime requirements are more stringent than national banks,

yeah, then cloud services are only slightly more expensive.

You know how many businesses fall into that specific narrow set of requirements?

bambax 7 hours ago

> it doesn't make much sense for the majority of startup companies until they become late stage

Here's what TFA says about this:

> Cloud companies generally make onboarding very easy, and offboarding very difficult. If you are not vigilant you will sleepwalk into a situation of high cloud costs and no way out.

and I think they're right. Be careful how you start because you may be stuck in the initial situation for a long time.

ashu1461 7 hours ago

And not just any FTEs, probably a few senior / staff-level engineers who would cost a lot more.

g-b-r 7 hours ago

You should keep in mind that for a lot of things you can use a servicing contract, rather than hiring full-time employees.

It's typically going to cost significantly less; it can make a lot of sense for small companies, especially.