|
| ▲ | q3k 6 hours ago | parent | next [-] |
| Only if that bitflip happens somewhere in your actual data, vs. some GPU pipeline register that then locks up the entire system until a power cycle. Or causes a wrong address to be fetched. Or causes other nasty silent errors. Or... Try doing fault injection on a chip some time. You'll see it's significantly easier to cause a crash / reset / hang than to just flip data bits. 'rad-triggered bit flips don't matter with AI' is a lie spoken by people who have obviously never done any digital design in their life. |
|
| ▲ | gbriel 6 hours ago | parent | prev | next [-] |
| As long as they stay below the Van Allen belts and deal with weaker magnetic shielding in sun-synchronous orbit (high latitudes). I would say they probably need something a little beefier than consumer hardware and just deal with lots of failures and bit flips. But cooling is probably a bigger issue? |
|
| ▲ | ohyoutravel 6 hours ago | parent | prev [-] |
| Random bit flips might even improve output. |
| |
| ▲ | adastra22 6 hours ago | parent [-] | | Single upset events in a modern GPU are not bitflips. They destroy the surrounding circuitry and usually disable the whole unit. | | |
| ▲ | pantalaimon 6 hours ago | parent [-] | | If that happens you disable that CUDA core.
If your GPU is too damaged, you deorbit the satellite. | | |
| ▲ | adastra22 6 hours ago | parent [-] | | Yes, and this will happen within weeks of launch with the orbits under consideration. |
|
|
|