I don't know how many times I need to say this, but I will die on this hill.

Centralized services don't decrease redundancy. They're usually far more redundant than whatever homegrown solution you can come up with.

The difference between centralized and homegrown is mostly psychological. We notice the outages of centralized systems more often, as they affect everything at the same time instead of different systems at different times. This is true even if, in a hypothetical world with no centralization, we'd have more total outage time than we do now.

If your gas station says "closed" due to a problem that only affects their own networks, people usually go "aah they're probably doing repairs or something", and forget about the problem 5 minutes later. If there's a Cloudflare outage... everybody (rightly) blames the Cloudflare outage.

Where this becomes a problem is when correlated failures are actually worse than uncorrelated ones. If Visa goes down, it's better if Mastercard stays up, because many customers have both and can use the other when one doesn't work. In some ways, it's better to have 30 mins of Visa outages today and 30 mins of Mastercard outages tomorrow, than to have just 15 mins of correlated outages in one day.
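To put rough numbers on the Visa/Mastercard point, here is a minimal sketch. It assumes a customer who holds both cards and is only stuck while both networks are down at once. The outage windows and the `overlap_minutes` helper are made up for illustration, not real incident data.

```python
# Outage windows are (start, end) pairs in minutes since midnight of day 1.
# All of these windows are hypothetical.
DAY = 24 * 60


def overlap_minutes(a, b):
    """Minutes during which both outage windows are active at once."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start)


# Scenario A: 30 min of Visa today, 30 min of Mastercard tomorrow.
visa_a = (600, 630)
mastercard_a = (DAY + 600, DAY + 630)

# Scenario B: a single 15 min outage that hits both networks together.
visa_b = (600, 615)
mastercard_b = (600, 615)

# Time during which a customer holding both cards cannot pay at all.
blocked_a = overlap_minutes(visa_a, mastercard_a)
blocked_b = overlap_minutes(visa_b, mastercard_b)

print("uncorrelated: blocked for", blocked_a, "min")
print("correlated:   blocked for", blocked_b, "min")
```

Under these assumptions, scenario A has four times the total outage minutes per network, yet the customer with both cards is never blocked, while scenario B blocks them for the full 15 minutes.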

freeplay an hour ago

The problem is creating a single point of failure.

There's no doubt a VM in AWS is far more redundant than my VM running on a couple of Intel NUCs in my closet.

The difference is, when I have a major outage, my blog goes down.

When EC2 has a major outage, all of the blogs go down. Along with Wikipedia, Starbucks, and half the internet.

That single point of failure is the issue.

	YetAnotherNick 41 minutes ago
		A single point of failure means exactly the opposite of what you think it means. If my work depends on 5 services all being up, each service is a single point of failure, and correlated failures actually improve the probability that I can do my work.
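The arithmetic behind the five-services claim can be sketched as follows. It assumes each service is up 99% of the time (an illustrative figure, not a measurement) and compares the two extremes: fully independent failures versus failures that always coincide. The function names are hypothetical.

```python
def p_all_up_independent(p_up, n):
    """Probability that all n services are up when failures are independent."""
    return p_up ** n


def p_all_up_fully_correlated(p_up, n):
    """Probability that all n services are up when they only ever fail together.

    n is unused on purpose: with perfectly correlated failures the n services
    behave like one service, so the count drops out of the result.
    """
    return p_up


P_UP = 0.99  # assumed per-service availability
N = 5        # services my work depends on

independent = p_all_up_independent(P_UP, N)
correlated = p_all_up_fully_correlated(P_UP, N)

print(f"independent failures: {independent:.4f}")
print(f"correlated failures:  {correlated:.4f}")
```

With these assumed numbers, independent failures leave all five services up about 95.1% of the time, against 99% when the failures coincide. Real systems sit somewhere between the two extremes, and this only covers the case where every service is required; it says nothing about cases where one service can substitute for another.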

lloeki 3 hours ago

"Redundancy" might not be the correct word. If we had a single worldwide mega-entity serving 100% of the internet, it would both be a monopoly and have tons of redundant infrastructure.

But it would also be quite unified: the system, while full of redundancies, is as a whole a single one, operated the same way end to end. Because it is a single system handled in a uniform way, a single glitch could bring it all down. There is no diversity in the system's implementation; the monoculture itself makes it vulnerable.

dgan 8 hours ago

> Centralized services don't decrease redundancy

Alright, but they create a failure correlation where previously there was none.

	silvestrov 3 hours ago
		Have you ever heard of the "sendmail worm", aka the Morris worm? https://en.wikipedia.org/wiki/Morris_worm You can definitely have failure correlation without having centralized services.

masfuerte 5 hours ago

In my experience services aren't failing due to a lack of redundancy but due to an excess of complexity. With the move to the cloud we are continually increasing both redundancy and complexity and this is making the problem worse.

I have a cheap VPS that has run reliably for a decade except for a planned hour of downtime, which was in the middle of the night when no one cared. Amazon is more reliable in theory. My cheap VPS is more reliable in practice.