SwiftyBug 3 days ago

I thought planes had insane redundancy exactly so stuff like that doesn't happen. How can a bit flip cause the system that controls altitude to malfunction like that?

procflora 3 days ago | parent | next [-]

From what I've heard (FWIW), Airbus released a version of the software for one of the flight computers that removed SEU protections (hence grounding affected models until they could be downgraded to the previous version).

There was still hardware redundancy though. Operation of the plane's elevator switched to a secondary computer. Presumably it was also running the same vulnerable software, but they diverted and landed early in part to minimize this risk.

So not just redundancy but layers of redundancy.
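To make the failover idea concrete, here is a minimal sketch of priority-ordered channel selection. It is purely illustrative and not how any real flight control system is implemented; `Channel`, `healthy`, `command`, and `select_channel` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Channel:
    """One redundant computer (hypothetical model for illustration)."""
    name: str
    healthy: bool    # result of the channel's own self-check (assumed to exist)
    command: float   # e.g. a commanded elevator deflection, arbitrary units


def select_channel(channels: Sequence[Channel]) -> Optional[Channel]:
    """Return the first channel, in priority order, that reports itself healthy."""
    for ch in channels:
        if ch.healthy:
            return ch
    return None  # no healthy channel left


# The primary has flagged a fault, so control passes to the secondary.
primary = Channel("primary", healthy=False, command=0.0)
secondary = Channel("secondary", healthy=True, command=1.5)
active = select_channel([primary, secondary])
```

Note that this only helps against faults the self-check can detect, and if every channel runs the same software, a common bug can affect all of them at once.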

willis936 3 days ago | parent | prev | next [-]

Why would you ever expect one bit flip? You have a flip rate and you design your system to tolerate a certain bit flip rate. Assumptions made during requirements establishment were wrong and nature eventually let them know they had negative margin.
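As a rough sketch of what designing to a flip rate means, you can model upsets as a Poisson process and ask how likely it is to see at least k of them in a given exposure. The rate, memory size, and duration below are made-up numbers for illustration only, and `prob_at_least` is a hypothetical helper, not anything from a real certification process.

```python
import math


def prob_at_least(k: int, rate_per_bit_hour: float, bits: int, hours: float) -> float:
    """P(at least k upsets) under a Poisson model with mean = rate * bits * hours."""
    mean = rate_per_bit_hour * bits * hours
    # P(N >= k) = 1 - sum_{i < k} exp(-mean) * mean**i / i!
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below


# Hypothetical numbers, chosen only to show the shape of the calculation.
RATE = 1e-12          # upsets per bit per hour (assumed)
BITS = 8 * 2**20      # 1 MiB of memory
HOURS = 10.0          # exposure time

p_one = prob_at_least(1, RATE, BITS, HOURS)   # at least one flip
p_two = prob_at_least(2, RATE, BITS, HOURS)   # at least two flips
```

If the assumed rate is too optimistic, the real probabilities are higher than the ones the design was checked against, which is the "negative margin" situation.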

p_l 3 days ago | parent | prev | next [-]

The possibility of bit flips from cosmic radiation only really came to the fore in the 1990s, and some aircraft and parts predate that.

15155 3 days ago | parent [-]

Smaller semiconductor feature size greatly increases the likelihood of these types of errors.

p_l 2 days ago | parent [-]

For a long time, ECC's main benefit was as a hedge against failing silicon and local EMI. Aviation already had careful EMI design and appropriately selected chips, so ECC was seen as less of a benefit...
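For anyone who hasn't seen how ECC corrects a flip, here is a toy Hamming(7,4) code: 4 data bits plus 3 parity bits, able to correct any single flipped bit. Real ECC memory typically uses wider codes, so treat this as a sketch of the principle rather than what any particular chip does. The function names are just for this example.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome); syndrome is the 1-based flipped position, or 0."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1   # correct the single flipped bit
    return [c[2], c[4], c[5], c[6]], syndrome


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                              # simulate a single-event upset
recovered, position = hamming74_decode(word)
```

Two flips in the same codeword would be miscorrected by this code, which is why practical schemes add an extra parity bit to at least detect double errors.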

bdangubic 3 days ago | parent | prev [-]

  if (cosmic_ray) {
     do_not_flip_bits()
  } else {
     flip_away()
  }

rjp0008 3 days ago | parent [-]

What if in the time between initialization of cosmic_ray to False, and the time this if statement executes, a legitimate cosmic ray flips the bool bit representing cosmic_ray?
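Joking aside, one common answer to this is to keep three copies of the value and take a bitwise majority vote when reading it, so a single flipped copy is outvoted. A minimal sketch, with `majority` as a hypothetical helper name:

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote: each output bit is set if at least two inputs have it set."""
    return (a & b) | (a & c) | (b & c)


# Three copies of the flag, all initialized to 0 (False).
copies = [0, 0, 0]

# Simulate a flip in one copy between initialization and use.
copies[1] ^= 1

# The vote still recovers the original value.
cosmic_ray = bool(majority(*copies))
```

This does not help if the flip lands on the voted result after the vote, or if two copies are hit, so it narrows the window rather than closing it.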

sunrunner 3 days ago | parent | next [-]

This is a really good point and a common error in bit flip detection code. To avoid this kind of look-before-you-leap hazard the following is recommended:

    try {
        do_action()
    } catch (BitFlipError e) {
        logger.critical("Shouldn't get here")
    }

Ask-for-forgiveness as an error detection pattern avoids these kinds of errors entirely.

terminalshort 3 days ago | parent | prev | next [-]

Simple! Make it an int.

  int cosmic_ray = 0
  if (bool(cosmic_ray)) {
     throw cosmicRayException()
  }

wavemode 3 days ago | parent | prev [-]

ah, a classic TORTOF bug (time-of-ray, time-of-flip)