jandrewrogers 2 days ago

The error correction in PCIe, SATA, etc. is too weak to be reliable for modern high-performance storage. This is a known deficiency, and it is why re-reading a corrupted page sometimes fixes things. PCIe 6.0 is introducing a much stronger error detection scheme to address this, which will mostly leave bit rot on the storage media as the major remaining vector.

The bare minimum you want these days is a 64-bit CRC. A strong 128-bit hash would be ideal. Even if you apply these only at the I/O boundaries, you'll catch most corruption. The places where it can realistically occur are shrinking, but most software makes minimal effort to detect this corruption even though it is a fairly well-bounded problem.
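
As a rough illustration of checking at the I/O boundary, here is a minimal Python sketch that appends a 128-bit digest to each block before it is written and verifies it after it is read back. The choice of BLAKE2b truncated to 16 bytes, the trailing-digest layout, and the names `seal` and `unseal` are all assumptions made for this example, not a description of any particular storage engine.

```python
import hashlib

DIGEST_SIZE = 16  # 128-bit digest (assumed size for this sketch)


def seal(payload: bytes) -> bytes:
    """Append a 128-bit BLAKE2b digest to the payload before writing it out."""
    digest = hashlib.blake2b(payload, digest_size=DIGEST_SIZE).digest()
    return payload + digest


def unseal(block: bytes) -> bytes:
    """Verify the trailing digest after reading; raise if the block is corrupt."""
    if len(block) < DIGEST_SIZE:
        raise ValueError("block too short to contain a digest")
    payload, stored = block[:-DIGEST_SIZE], block[-DIGEST_SIZE:]
    actual = hashlib.blake2b(payload, digest_size=DIGEST_SIZE).digest()
    if stored != actual:
        raise ValueError("checksum mismatch: block is corrupt")
    return payload
```

A caller would pass `seal(page)` to its write path and run `unseal` on whatever comes back from the read path. On a mismatch it could retry the read first, since a transient transfer error may clear on a second attempt, while a mismatch that persists points at the stored data itself.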

daneel_w 2 days ago

Thanks for the tech details.