Oh no, we had 30 minutes of downtime this year :(

CableNinja 9 hours ago | parent | next [-]

5 9's is like 7 minutes a year. They are breaking SLAs and impacting services people depend on

Tbh though this is sort of all the other companies' fault: "everyone" uses AWS and CF, and so others follow. Now not only are all your chicks in one basket, so are everyone else's. When the basket inevitably falls into a lake....

Providers need to be more aware of their global impact in outages, and customers need to be more diverse in their spread.

world2vec 8 hours ago | parent | next [-]

99.999% availability is around 5 minutes or so of downtime per year.
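For anyone who wants to check the numbers being thrown around here, a rough sketch of the arithmetic. The function names are made up for illustration, and it assumes a 365.25-day year and that availability is measured over the whole year:

```python
# Availability <-> downtime arithmetic (illustrative sketch; names are hypothetical).
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes, assuming a 365.25-day year


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year permitted at a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def availability_from_downtime(downtime_minutes: float) -> float:
    """Availability percentage implied by a given number of downtime minutes per year."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        print(f"{pct}% -> {allowed_downtime_minutes(pct):.2f} min/year")
    print(f"30 min/year -> {availability_from_downtime(30):.4f}%")
```

Under those assumptions, five nines comes out to a bit over 5 minutes a year, and 30 minutes of downtime in a year sits between four and five nines.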

weird-eye-issue 8 hours ago | parent | prev [-]

> Providers need to be more aware of their global impact in outages

So you think the problem is they aren't "aware"?

CableNinja 8 hours ago | parent [-]

These kinds of outages continue to happen and continue to impact 50+% of the internet. Yes, they know they have that power, but they don't treat changes as such, so no, they aren't aware. Awareness would imply more care in operations like code changes and deployments.

Outages happen, code changes occur; but you can do a lot to prevent these things on a large scale, and they simply don't.

Where is the A/B deployment, preventing a full outage? What about internally: where was the validation before the change? Was the testing run against a prod-like environment, or something that once resembled prod but hasn't in forever?

They could absolutely mitigate impacting the entire global infra in multiple ways, and haven't, despite their many outages.

richardwhiuk 7 hours ago | parent [-]

They are aware. They don't want to pay the cost benefit tradeoff. Education won't help - this is a very heavily argued tradeoff in every large software company.

pell 9 hours ago | parent | prev [-]

I do think this is tenable as long as these services are reliable. Even though there have been some outages, I would argue that they’re incredibly reliable at this point. If this ever changes, though, moving to a competitor won’t be as simple as pushing a repository elsewhere, especially for AWS. I think that’s where some of the potential danger lies.

swyx 8 hours ago | parent | next [-]

> 30 minutes of downtime

> this is tenable as long as these services are reliable

do you hear yourself, this is supposed to be a distributed CDN. imagine if HTTP had 30 minutes of downtime a year.

and judging by the HN post age, we're now past minute 60 of this incident.

weird-eye-issue 8 hours ago | parent [-]

> and judging by the HN post age, we're now past minute 60 of this incident.

Huh? It's been back up during most of this time. It was up and then briefly went back down again, but it's been up for a while now. Total downtime was closer to 30 minutes.

swyx 8 hours ago | parent [-]

twitter still down for me

mrkramer 7 hours ago | parent [-]

Twitter is down while Mastodon is proudly and strongly still standing up. I knew this day would come.

swyx 5 hours ago | parent [-]

i also can host apps for 100 users

weird-eye-issue 9 hours ago | parent | prev [-]

> especially for AWS

CF can be just as difficult to migrate off of, if not more so, especially when using things like Durable Objects.