| ▲ | basilikum 3 hours ago | |
This is the kind of system failure that we need really good and well tested disaster recovery plans for. While not necessary this time, DENIC and any critical infrastructure provider should be able to rebuild their entire infrastructure from scratch in a tolerable amount of time (Rather days than hours in the case of a full rebuild). Importantly the disaster recovery plan has to work without reliance on either the system that is failing, but also on adjacent systems that might have hidden dependencies on the failing system. I'm really not too close to Denic and know nothing about their internals, but just close enough to have experienced the stress of someone working for DENIC second hand during the outage. From the very limited information I happened to gather DENIC had some trouble in addressing the issue because, surprise, infrastructure that they need to do so runs on de domains. [1] I'm convinced there are all kinds of extended cyclic decencies between different centralization points in the net. If some important backbone of the internet is down for an extended time, this will absolutely cause cascading failures. And thesw central points of failure are only getting worse. I love Let's Encrypt, but if something causes them to hard fail things will go really bad once certificates start to expire. We need concrete plans to cold start extended parts of the internet. If things go really bad once and communication lines start to fail, we're in for a bad time. Maybe governments have redundant, ultra resistant, low tech communication lines, war rooms and a list of important people in the industry who they can find and put in these war rooms so they can coordinate the rebuild of infrastructure. But I doubt it. [^1] I don't know if there is some kind of disaster plan in the drawer at DENIC that would address this. I don't mean to allege anything against DENIC specifically, but broadly speaking about companies and infrastructure providers, I would not be surprised if there was absolutely no plan on what to do if things really go down and how to cold start cyclic dependencies or where they even are. | ||