jandrewrogers 2 days ago

Actually, bad news: most popular filesystems and filesystem configurations have limited and/or weak checksums, certainly much weaker than you'd want for a database. 16-bit and 32-bit CRCs are common in filesystems.

This is a major reason databases implement their own checksums. Unfortunately, many open source databases have weak or non-existent checksums too. It is sort of an indefensible oversight.
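To make that concrete, here is a minimal sketch of the kind of per-page checksum a database layers on top of the filesystem (Python; the 8 KiB page layout is made up for illustration and isn't any particular engine's format):

    import os
    import zlib

    PAGE_SIZE = 8192  # hypothetical page size, not any real engine's default

    def write_page(payload: bytes) -> bytes:
        # Prefix the payload with its CRC-32 so corruption is detectable on read.
        assert len(payload) == PAGE_SIZE - 4
        return zlib.crc32(payload).to_bytes(4, "little") + payload

    def read_page(page: bytes) -> bytes:
        # Recompute and compare. A mismatch means corruption; a random
        # corruption still passes undetected with probability ~2^-32.
        stored = int.from_bytes(page[:4], "little")
        payload = page[4:]
        if zlib.crc32(payload) != stored:
            raise IOError("page checksum mismatch")
        return payload

    payload = read_page(write_page(os.urandom(PAGE_SIZE - 4)))

Stronger engines use wider or keyed checksums for exactly the reason above: a 32-bit CRC bounds your undetected-corruption rate at roughly 2^-32 per corrupted page.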

fc417fc802 a day ago

Assuming you expect corruption to be exceedingly rare, what's wrong with a 1 in 2^16 or 1 in 2^32 failure rate? That's 4 9s and 9 9s respectively for detecting an event that you hardly expect to happen in the first place.

At 32 bits you're well into the realm of tail risks which include things like massive solar flares or the data center itself being flattened in an explosion or natural disaster.
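The arithmetic behind those figures, for anyone who wants to check it (just the standard 1/2^n collision bound for an n-bit checksum):

    # An n-bit checksum lets a random corruption slip through with
    # probability 1/2^n, i.e. a detection rate of 1 - 2^-n.
    for bits in (16, 32):
        miss = 2.0 ** -bits
        print(f"{bits}-bit: undetected {miss:.3e}, detected {1 - miss:.12f}")
    # 16-bit: undetected 1.526e-05, detected 0.999984741211  (~4.8 nines)
    # 32-bit: undetected 2.328e-10, detected 0.999999999767  (~9.6 nines)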

Edit: I just checked a local drive for concrete numbers. It's part of a btrfs array. Relevant statistics since it was added: 11k power-on hours, 24 TiB written, 108 TiB read, and 32 corruption events at the fs level (all attributable to the same power failure; no corruption before or since). I can't be bothered to compute the exact number, but at an absolute minimum it would take multiple decades of operation before I'd expect even a single corruption event to go unnoticed. I'm fairly certain that my house is far more likely to burn down in that time frame.
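A back-of-envelope version of that computation, assuming btrfs's default crc32c and treating each corruption event as one independently checksummed block (both assumptions, but the conclusion isn't sensitive to them):

    corruption_events = 32      # all from one power failure
    hours = 11_000              # power-on hours over the same period

    events_per_hour = corruption_events / hours
    miss_prob = 2.0 ** -32      # per-event chance of an undetected collision

    hours_to_one_miss = 1 / (events_per_hour * miss_prob)
    years = hours_to_one_miss / (24 * 365)
    print(f"~{years:.1e} years to expect one undetected corruption")
    # ~1.7e+08 years, so "multiple decades" is a very safe lower bound.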

lxgr 19 hours ago

> Most popular filesystems and filesystem configurations have limited and/or weak checksums,

Because filesystems, too, mainly use them to detect inconsistencies introduced by partial or reordered writes, not random bit flips. That's also why most filesystems only checksum metadata, not data.
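A sketch of that use case: a journal-style commit record that carries a CRC over its own contents, so a torn (partially persisted) write fails verification on replay and is simply discarded. The record format here is hypothetical, not any real filesystem's journal:

    import struct
    import zlib

    def encode_record(payload: bytes) -> bytes:
        # [length][crc32][payload] -- if the tail of the record never hit
        # the disk, the CRC over what *was* written won't match.
        return struct.pack("<II", len(payload), zlib.crc32(payload)) + payload

    def replay(log: bytes) -> list[bytes]:
        records, off = [], 0
        while off + 8 <= len(log):
            length, crc = struct.unpack_from("<II", log, off)
            payload = log[off + 8 : off + 8 + length]
            if len(payload) < length or zlib.crc32(payload) != crc:
                break  # torn or reordered write: stop replay here
            records.append(payload)
            off += 8 + length
        return records

    log = encode_record(b"txn-1") + encode_record(b"txn-2")
    torn = log[:-3]                      # simulate a partial final write
    assert replay(torn) == [b"txn-1"]    # txn-2 is dropped, not half-applied

For that purpose a short CRC is plenty: the checksum only has to distinguish "this record was fully written" from "it wasn't", not catch adversarial or silent media corruption in file contents.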