Remix.run Logo
KPGv2 4 hours ago

Bitflips are something that can happen in consumer-grade RAM, so that tracks (and it's comforting that wayward cosmic rays are a substantial reason for an application's crashes!), but on enterprise servers, they will run ECC RAM that is very resistant to bit flips.

This is why data hoarders who have NASes with lots of space insist on running their servers with ECC RAM despite it being significantly more expensive. Because bit flips, for all intents and purposes, cannot happen. The RAM itself detects and corrects for them.

I wouldn't expect bit flips to be a significant contributor to enterprise problems.

Anon1096 4 hours ago | parent | next [-]

Bitflips specifically may not be; things like network issues, noisy neighbors, row/rack/host maintenance (leading to a downed and migrated host) absolutely are things that happen at high frequency at scale and cause your background level of errors to be more than 0.

maccard 4 hours ago | parent | prev [-]

You've completely missed the point - It's not about bitflips it's about errors that are outside the scope of what's fixable.