| ▲ | drob518 3 hours ago | |||||||||||||||||||||||||
Frequently, when you see these massive failures, the root cause is an alignment of small weaknesses that all come together on a specific day. See, for instance, the space shuttle O-ring incident, Three Mile Island, Fukushima, etc. These are complex systems with lots of moving parts and lots of (sometimes independent) people managing them. In a sense, the complexity is the common root cause. | ||||||||||||||||||||||||||
| ▲ | linuxguy2 2 hours ago | parent | next [-] | |||||||||||||||||||||||||
It's like the Swiss cheese model: every system has several layers, each layer has "holes" (vulnerabilities), and a major incident only occurs when the holes line up through all the layers. | ||||||||||||||||||||||||||
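As a rough illustration of the Swiss cheese idea, here is a minimal sketch that assumes each layer fails independently with some probability. The function name and the numbers are made up for illustration, and real layers are often correlated, which makes the true risk higher than this product suggests.

```python
import math


def prob_incident(hole_probs):
    """Chance that a hazard passes every layer.

    Assumes (hypothetically) that layers fail independently, so the
    combined probability is the product of the per-layer probabilities.
    """
    if not hole_probs:
        raise ValueError("need at least one layer")
    for p in hole_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be in [0, 1]")
    return math.prod(hole_probs)


# Three layers, each with an illustrative 10% chance of having a hole
# in the path of a given hazard.
three_layers = prob_incident([0.1, 0.1, 0.1])
```

Under these assumptions, each extra layer multiplies the risk down, which is why an incident needs every hole to line up on the same day.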
| ▲ | roenxi an hour ago | parent | prev | next [-] | |||||||||||||||||||||||||
> See, for instance, the space shuttle O-ring incident

That wasn't really a result of an alignment of small weaknesses though. One of the reasons that whole thing was of particular interest was Feynman's withering appendix to the report, where he pointed out that the management team wasn't listening to the engineering assessments of the safety of the venture and was making judgement calls like claiming that a component that had failed in testing was safe. If a situation is being managed by people who can't assess technical risk, the failures aren't the result of many small weaknesses aligning. It was less an alignment of small failures than the probable failure of a component that was well understood to be a likely point of failure, driven by poor management.

> Fukushima

This one too. Wasn't the reactor hit by a wave that was outside design tolerance? My memory was that they were hit by an earthquake that was outside design spec, then a tsunami that was outside design spec. That isn't a number of small weaknesses coming together. If you hit something with forces outside design spec then it might break. Not much of a mystery there.

From a similar perspective, if you design things for a 1:500 year storm then 1/500th of them might easily fail to storms every year. No small alignment of circumstances needed. | ||||||||||||||||||||||||||
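The 1:500 year arithmetic can be sketched as follows. This is a back-of-the-envelope model, assuming each year is independent and that any storm beyond design spec causes a failure; the function names and the example figures are hypothetical.

```python
def expected_failures_per_year(n_structures, return_period_years):
    """Expected number of structures hit by a beyond-spec storm each year."""
    return n_structures / return_period_years


def prob_exceeded_within(return_period_years, years):
    """Chance a single structure sees at least one beyond-spec storm
    within the given number of years (independent years assumed)."""
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** years


# 1000 structures, each designed for a 1:500 year storm.
per_year = expected_failures_per_year(1000, 500)

# One such structure over a 50-year service life.
lifetime_risk = prob_exceeded_within(500, 50)
```

With these numbers, about two of the 1000 structures would be expected to see a beyond-spec storm each year, and a single structure has roughly a 9.5% chance of seeing one over 50 years.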
| ▲ | amelius 3 hours ago | parent | prev [-] | |||||||||||||||||||||||||
It usually starts with a broken coffee machine. | ||||||||||||||||||||||||||