daneel_w 2 days ago
> A significant fraction of data corruption occurs while it is being moved between storage and memory

Surely you mean on the memory bus specifically? SATA and PCIe both have some error correction methods for securing transfers between storage and host controller. I'm not sure about old parallel ATA. While I understand it can happen under conditions similar to non-ECC RAM being corrupted, I don't think I've ever heard or read about a case where a storage device randomly returned erroneous data, short of a legitimate hardware error.
jandrewrogers 2 days ago | parent
The error correction in PCIe, SATA, etc. is too weak to be reliable for modern high-performance storage. This is a known deficiency, and it is why re-reading a corrupted page sometimes fixes things. PCIe v6 is introducing a much stronger error detection scheme to address this, which will mostly leave bit-rot on storage media as the major vector.

The bare minimum you want these days is a 64-bit CRC. A strong 128-bit hash would be ideal. Even if you apply these only at the I/O boundaries, you'll catch most corruption. The places where corruption can realistically occur are shrinking, but most software makes minimal effort to detect it even though it is a fairly well-bounded problem.
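
A minimal sketch of what a check at the I/O boundary could look like, assuming Python's standard `hashlib.blake2b` truncated to 16 bytes as a stand-in for the "strong 128-bit hash". The names `seal_page` and `open_page` and the trailing-digest layout are hypothetical, chosen only for illustration:

```python
import hashlib

DIGEST_SIZE = 16  # 128-bit digest; the size is an assumption for this sketch


def seal_page(payload: bytes) -> bytes:
    """Append a 128-bit BLAKE2b digest to the payload before writing it out."""
    digest = hashlib.blake2b(payload, digest_size=DIGEST_SIZE).digest()
    return payload + digest


def open_page(sealed: bytes) -> bytes:
    """Verify the trailing digest after reading a page back; return the payload."""
    if len(sealed) < DIGEST_SIZE:
        raise ValueError("sealed page is shorter than its digest")
    payload, stored = sealed[:-DIGEST_SIZE], sealed[-DIGEST_SIZE:]
    actual = hashlib.blake2b(payload, digest_size=DIGEST_SIZE).digest()
    if actual != stored:
        # The caller can retry the read here; a mismatch that goes away on
        # re-read would point at the transfer rather than the media.
        raise ValueError("page checksum mismatch")
    return payload
```

Calling `seal_page` just before the write and `open_page` just after the read covers the bus, the controller, and the media in a single check, at a cost of 16 bytes per page.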