It’s that time of the year again where we all realize that relying on AWS and Cloudflare to this degree is pretty dangerous but then again it’s difficult to switch at this point.

If there is a slight positive note to all this, then it is that these outages are so large that customers usually seem to be quite understanding.

▲

isodev 9 hours ago | parent | next [-]

Unless you’re, say, at the airport trying to file a luggage claim … or at the pharmacy trying to get your prescription. I think as a community we have a responsibility to do better than this.

▲

ChrisMarshallNY 9 hours ago | parent | next [-]

> I think as a community we have a responsibility to do better than this.

I have always felt so, but my opinion is definitely in the minority.

In fact, I find that folks have extremely negative responses to any discussion of improving software quality.

▲

abustamam 4 hours ago | parent | next [-]

I always see such negative responses when HN brings up software bloat ("why is your static site measured in megabytes").

Now that we have an abundance of compute and most people run devices more powerful than the devices that put man on the moon, it's easier than ever to make app bloat, especially when using a framework like Electron or React Native.

People take it personally when you say they write poor quality software, but it's not a personal attack, it's an observation of modern software practices.

And I'm guilty of this, mainly because I work for companies that prioritize speed of development over quality of software, and I suspect most developers are in this trap.

	▲	ChrisMarshallNY 2 hours ago \| parent [-]
		What I find annoying is people making fun of folks who choose to “roll their own.” The typical argument that I see is homemade encryption, which is quite valid. However, encryption is just a tiny corner of the surface. Most folks don’t want to haul in 1MB of junk just so they can animate a transition. Well, I guess I should qualify that: most normal folks wouldn't want to do that, but, apparently, it's de rigueur for today's coders.

▲

mosura 8 hours ago | parent | prev [-]

Merely reducing external dependencies causes people to come out in rashes.

A large proportion of “developers” enjoy build vs buy arguments far too much.

▲

sigilis 9 hours ago | parent | prev [-]

You aren’t Cloudflare’s customer in these examples. It is up to the companies that are actually paying for and using the service to complain. Odds are that they won’t care on your behalf due to how our society is structured.

Not really sure how our community is supposed to deal with this.

	▲	isodev 9 hours ago \| parent [-]
		“We” are the ones designing the architecture and writing the technical specs of these services. Making sure they still work when your favourite FAANGMC is down seems like something we can help with.

▲

dlisboa 9 hours ago | parent | prev | next [-]

> If there is a slight positive note to all this, then it is that these outages are so large that customers usually seem to be quite understanding.

Which only shows that chasing five 9s is worthless for almost all web products. The idea is that by relying on AWS or Cloudflare you can push your uptime numbers up to that standard, but these companies are themselves having such frequent outages that customers don't expect that kind of reliability from web products.

▲

tommica 9 hours ago | parent | prev | next [-]

> It’s that time of the year again

It's monthly by now

▲

lbreakjai 9 hours ago | parent | prev | next [-]

If I choose AWS/Cloudflare and we're down with half of the internet, then I don't even need to explain it to my boss's bosses, because there will be an article in the mainstream media.

If I choose something else, we're down, and our competitors aren't, then my overlords will start asking a lot of questions.

▲

stevepotter 9 hours ago | parent | next [-]

Yup. AWS went down at a previous job and everyone basically took the day off and the company collectively chuckled. Cloudflare is interesting because most execs don’t know about it, so I’d imagine they’d be less forgiving. “So what does Cloudflare do for us exactly? Don’t we already have AWS?”

▲

jfengel 9 hours ago | parent | prev | next [-]

And if everyone else is down, and you are not, you will get no credit.

▲

lbreakjai 8 hours ago | parent | next [-]

Or _you_ aren't down, but a third-party you depend on is (auth0, payment gateway, what have you), and you invested a lot of time and effort into being reliable, but it was all for less than nothing, because your website loads but customers can't purchase, and they associate the problem with you, not with the AWS outage.


▲

trollbridge 8 hours ago | parent | prev [-]

Right. Whereas if we get whacked with a random DDoS, that's my fault.

▲

timeon 9 hours ago | parent | prev [-]

In reality it is not half of the internet. That is just marketing. I personally noticed only one news site down while the others were working. And I guess sites like that will get the blame.

▲

fusl 9 hours ago | parent | prev | next [-]

Happy to hear anyone's suggestions about where else to go or what else to do with regard to protecting against large-scale volumetric DDoS attacks. Pretty much every CDN provider nowadays has stacked up enough capacity to tank these kinds of attacks. Good luck trying to combat them yourself these days.

▲

trollbridge 8 hours ago | parent | next [-]

Somehow KiwiFarms figured it out with their own "KiwiFlare" DDOS mitigation. Unfortunately, all of the other Cloudflare-like services seem exceptionally shady, will be less reliable than Cloudflare, and probably share data with foreign intelligence services I have even less trust for than the ones Cloudflare possibly shares them with.

▲

isodev 9 hours ago | parent | prev | next [-]

Anubis and/or Bunny are good alternatives, or a good combination, depending on your exact needs:

- https://anubis.techaro.lol/

- https://bunny.net/

▲

fusl 9 hours ago | parent | next [-]

Unfortunately Anubis doesn't help when my pipe to the internet isn't fat enough to absorb all the bandwidth that the attacker has available. Renting tens of terabits of capacity isn't cheap, and DDoS attacks nowadays are on that scale. BunnyCDN's DDoS protection is unfortunately too basic to filter out anything that's even slightly more sophisticated. Cloudflare's flexibility in terms of custom rulesets and their global pre-trained rulesets (based on attacks they've seen in the past) is imo just unbeatable at this time.

	▲	isodev 9 hours ago \| parent \| next [-]
		The Bunny Shield is quite similar to the Cloudflare setup. Maybe not 100% overlap of features but unless you’re Twitter or Facebook, it’s probably enough. I think at the very least, one should plan the ability to switch to an alternative when your main choice fails… which together with AWS and GitHub is a weekly event now.
	▲	immibis an hour ago \| parent \| prev [-]
		We live in a world of mass internet surveillance. DDoS attacks like this are not very common, partly because the people who launch them keep going to jail.

▲

Doman 9 hours ago | parent | prev | next [-]

bunny.net is not reachable for me either... really funny

https://imgur.com/a/8gh3hOb

	▲	isodev 5 hours ago \| parent \| next [-]
		All the edges are gone! :)
	▲	haar 8 hours ago \| parent \| prev [-]
		I clicked the image thinking I was seeing the message you were getting (geoblocked in the UK), then realised I'd clicked an imgur link :facepalm: (Note: Zero negative sentiment towards imgur here)


▲

RKFADU_UOFCCLEL 7 hours ago | parent | prev [-]

Why do people on a technical website suggest this? It's literally the same snake oil as Cloudflare. Both have an endgame of total web DRM; they want to make sure users "aren't bots". Each time the DRM is cracked, they will increase the complexity of the "verifier". You will be running arbitrary code in your big 4 browser to ensure you're running a certified big 4 browser, with 10 trillion man hours of development, on a certified OS.

▲

callalex 4 hours ago | parent [-]

Because there is a real problem that needs to be solved one way or another.

	▲	RKFADU_UOFCCLEL 28 minutes ago \| parent [-]
		Anubis doesn't solve anything, bud.

▲

bandrami 8 hours ago | parent | prev | next [-]

Is a DDOS more frequent and/or worse than stochastic CDN outages?

▲

q3k 9 hours ago | parent | prev [-]

Just accept that a DDoS might happen and that there's nothing you can do about it. It's fine, it's just how the Internet works.

▲

herbst 9 hours ago | parent | next [-]

That was possible when a DDoS was usually still an occasional attack by a bad actor.

Most times I get DDoSed now, it's either Facebook directly, something something Azure, or some random AI.

▲

q3k 9 hours ago | parent | next [-]

That sounds like an app-level (D)DoS, which is generally something you can mitigate yourself.
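One common way to mitigate this kind of app-level flood yourself is per-client rate limiting. Below is a minimal token-bucket sketch in Python. The class name, the rate and capacity values, and keying buckets by client IP are all assumptions made for illustration, and this does nothing against volumetric attacks that saturate your link before requests reach the app.

```python
import time
from collections import defaultdict
from typing import Optional


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if one request may proceed, consuming a token."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical policy: one bucket per client IP, 5 requests/second, bursts of 10.
buckets = defaultdict(lambda: TokenBucket(rate=5.0, capacity=10.0))


def allow_request(client_ip: str) -> bool:
    """Decide whether a request from `client_ip` should be served or rejected."""
    return buckets[client_ip].allow()
```

A request handler would call `allow_request(ip)` and answer with HTTP 429 when it returns False. Keying by IP is the weak point when the traffic comes from a constantly changing set of addresses, so in practice the key might have to be something coarser, such as a network range or a user-agent class.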


▲

geerlingguy 8 hours ago | parent | prev [-]

It's harder when it's a new group of IPs and happens 2-3x every month.

	▲	herbst 7 hours ago \| parent [-]
		And if you do rule-based blocking, they just change their approach. I am constantly blocking big corps these days, with barely any work caused by normal bad actors. And lots of real users' time gets wasted on captchas.

▲

nhecker 8 hours ago | parent | prev [-]

How (or to what end) would Facebook want to directly DoS someone?

	▲	herbst 7 hours ago \| parent \| next [-]
		What do they even have a spider for? I never saw any actual traffic with source Facebook. I don't understand it either, but it's their official IPs and their official bot headers, and it behaves exactly like someone who wants my sites down. Does it make sense? Nah, but is it part of the weird reality we live in? Looks like it. I have no way of contacting Facebook. All I can do is keep complaining on Hacker News whenever the topic arises. Edit: Oh, and I see the same with Azure; however, there I have no list of IPs to verify it's official, so I only assume so because it looks like it.
	▲	inferiorhuman 7 hours ago \| parent \| prev [-]
		I got DoS'd by them once, email not HTTP traffic though. Quick slip of their finger and bam low cost load testing.

▲

peanut-walrus 9 hours ago | parent | prev [-]

So accept that your customers won't be able to use your services whenever some russian teenager is bored? Yeah, good luck with justifying that choice.

▲

q3k 9 hours ago | parent [-]

And how often does that happen?

▲

peanut-walrus 8 hours ago | parent [-]

For the service I'm responsible for, 4 times in the last 24 hours.

	▲	q3k 8 hours ago \| parent [-]
		Congratulations, you're the exception rather than the norm.

▲

weird-eye-issue 9 hours ago | parent | prev [-]

Oh no, we had 30 minutes of downtime this year :(

▲

CableNinja 8 hours ago | parent | next [-]

5 9's is like 7 minutes a year. They are breaking SLAs and impacting services people depend on

Tbh though, this is sort of all the other companies' fault: "everyone" uses AWS and CF, and so others follow. Now not only are all your chicks in one basket, so are everyone else's. When the basket inevitably falls into a lake....

Providers need to be more aware of their global impact in outages, and customers need to be more diverse in their spread.

▲

world2vec 8 hours ago | parent | next [-]

99.999% availability is around 5 minutes or so of downtime per year.
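For anyone who wants to check the figures, the arithmetic is short enough to sketch in Python. This assumes an average 365.25-day year, and the function name is made up for illustration.

```python
# Minutes in an average year, counting leap days (an assumption; use 365 for a strict calendar year).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Downtime budget per year, in minutes, for a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for availability in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(availability)
        print(f"{availability}% -> {minutes:.2f} minutes/year")
```

By the same arithmetic, five nines leaves a budget of a little over 5 minutes a year, and 30 minutes of downtime in a year works out to roughly 99.994% availability, which sits between four and five nines.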

▲

weird-eye-issue 8 hours ago | parent | prev [-]

> Providers need to be more aware of their global impact in outages

So you think the problem is they aren't "aware"?

▲

CableNinja 8 hours ago | parent [-]

These kinds of outages continue to happen and continue to impact 50+% of the internet. Yes, they know they have that power, but they don't treat changes as such, so no, they aren't aware. Awareness would imply more care in operations like code changes and deployments.

Outages happen, code changes occur; but you can do a lot to prevent these things on a large scale, and they simply don't.

Where is the A/B deployment, preventing a full outage? What about internally: where was the validation before the change? Was the testing run against a prod-like environment, or against something that once resembled prod but hasn't in forever?

They could absolutely mitigate the impact on the entire global infra in multiple ways, and haven't, despite their many outages.
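To make the staged-deployment idea concrete, here is a minimal sketch in Python of rolling a change out stage by stage and backing it out at the first failed health check. It is a generic illustration, not a description of how any provider actually deploys; the function name, the stage names in the usage note, and the `deploy`, `healthy`, and `rollback` callbacks are all hypothetical.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    stages: Sequence[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Deploy to each stage in order; on the first unhealthy stage, roll back and stop.

    Returns True if every stage was deployed and passed its health check.
    """
    completed: List[str] = []
    for stage in stages:
        deploy(stage)
        completed.append(stage)
        if not healthy(stage):
            # Undo in reverse order so the widest stage reached is reverted first.
            for done in reversed(completed):
                rollback(done)
            return False
    return True
```

Called with stages such as `["canary", "one-region", "global"]`, a change that fails its health check in the second stage never reaches the third, so the blast radius stays limited to whatever the early stages cover.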

	▲	richardwhiuk 7 hours ago \| parent [-]
		They are aware. They just don't think the cost-benefit tradeoff is worth paying for. Education won't help; this tradeoff is very heavily argued in every large software company.

▲

pell 9 hours ago | parent | prev [-]

I do think this is tenable as long as these services are reliable. Even though there have been some outages, I would argue that they’re incredibly reliable at this point. If that ever changes, though, moving to a competitor won’t be as simple as pushing a repository elsewhere, especially for AWS. I think that’s where some of the potential danger lies.

▲

swyx 8 hours ago | parent | next [-]

> 30 minutes of downtime

> this is tenable as long as these services are reliable

do you hear yourself, this is supposed to be a distributed CDN. imagine if HTTP had 30 minutes of downtime a year.

and judging by the HN post age, we're now past minute 60 of this incident.

▲

weird-eye-issue 8 hours ago | parent [-]

> and judging by the HN post age, we're now past minute 60 of this incident.

Huh? It's been back up during most of this time. It was up and then briefly went back down again but it's been up for a while now. Total downtime was closer to 30 minutes

▲

swyx 8 hours ago | parent [-]

twitter still down for me

▲

mrkramer 7 hours ago | parent [-]

Twitter is down while Mastodon is proudly and strongly still standing up. I knew this day would come.

	▲	swyx 5 hours ago \| parent [-]
		i also can host apps for 100 users

▲

weird-eye-issue 9 hours ago | parent | prev [-]

> especially for AWS

CF can be just as difficult to migrate off of, if not more so, especially when using things like Durable Objects.