klempner 12 hours ago

>HDDs typically have a BER (Bit Error Rate) of 1 in 10^15, meaning some incorrect data can be expected around every 100 TiB read. That used to be a lot, but now that is only 3 or 4 full drive reads on modern large-scale drives. Silent corruption is one of those problems you only notice after it has already done damage.

While the advice is sound, this number isn't the right number for this argument.

That 10^15 number is for UREs (unrecoverable read errors), which aren't going to cause silent data corruption -- simple naive RAID-style mirroring/parity will easily recover from a known, reported error of this sort without any filesystem-layer checksumming. The rates for silent errors, where the disk returns wrong data without reporting a problem (the case that actually benefits from checksumming), are a couple of orders of magnitude lower.
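To make that distinction concrete, here is a minimal Python sketch of a two-way mirror read path -- the helper names and checksum scheme are made up for illustration, not real RAID or ZFS code. A reported URE tells you which copy is bad, so the mirror alone repairs it; silent corruption looks like a successful read, so only a stored checksum can say which copy to trust.

    import hashlib

    def read_mirrored_block(copies, stored_checksum):
        # copies: list of byte strings, or None where the drive reported a URE.
        # stored_checksum: hex digest recorded at write time (filesystem-level).

        # Case 1: reported error (URE). The drive tells us which copy is bad,
        # so plain mirroring hands back a surviving copy -- no checksum needed.
        readable = [c for c in copies if c is not None]
        if len(readable) < len(copies) and readable:
            return readable[0]

        # Case 2: silent corruption. Every read "succeeds", but the copies may
        # disagree; only a checksum can say which copy (if any) is good.
        for data in readable:
            if hashlib.sha256(data).hexdigest() == stored_checksum:
                return data
        raise IOError("no copy passes checksum verification")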

iberator 6 hours ago | parent | next

This is pure theory. Shouldn't BER be counted per sector etc.? We shouldn't treat all disk space as a single entity, IMO.

thfuran 4 hours ago | parent

Why would that make a difference unless some sectors have higher/lower error rates than others?

Dylan16807 3 hours ago | parent

For a fixed bit error rate, making your typical error 100x bigger means it will happen 100x less often.

If the typical error is an entire sector, that's over 30 thousand bits. 1:1e15 BER could mean 1 corrupted bit every 100 terabytes or it could mean 1 corrupted sector every 4 exabytes. Or anything in between. If there's any more detailed spec for what that number means, I'd love to see it.
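A quick back-of-the-envelope check of those two extremes -- a rough sketch assuming 4 KiB sectors and decimal TB/EB:

    # Two extreme readings of a 1-in-1e15 bit error rate (BER).
    BER = 1e-15
    SECTOR_BITS = 4096 * 8                        # 4 KiB sector = 32,768 bits

    # (a) Errors are independent single bits:
    bits_per_error = 1 / BER                      # 1e15 bits between errors
    print(bits_per_error / 8 / 1e12, "TB per corrupted bit")        # ~125 TB (~114 TiB)

    # (b) Errors always arrive as a whole corrupted sector, at the same BER:
    bits_per_sector_error = SECTOR_BITS / BER     # ~3.3e19 bits between errors
    print(bits_per_sector_error / 8 / 1e18, "EB per corrupted sector")  # ~4.1 EB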

digiown 4 hours ago | parent | prev

This stat is also complete bullshit. If it were true, scrubs of any 20+ TB pool would quite frequently turn up at least corrected errors. But this is not the case.

Consumer-grade drives are often given an even worse spec of 1 in 1e14. For a 20 TB drive, that's more than one error per full scrub, which does not happen. I don't know about you, but I would not consider a drive to be functional at all if reading it out in full produced more than one error on average. Pretty much nothing said on that datasheet reflects reality.
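The arithmetic behind that claim, as a rough sketch -- it naively treats the 1-in-1e14 spec as an independent per-bit rate over a decimal 20 TB drive:

    # Expected read errors for one full pass over a 20 TB drive,
    # naively treating a 1-in-1e14 BER spec as an independent per-bit rate.
    BER = 1e-14
    drive_bytes = 20e12                 # 20 TB (decimal)
    drive_bits = drive_bytes * 8        # 1.6e14 bits
    expected_errors = drive_bits * BER
    print(expected_errors)              # 1.6 expected errors per full read/scrub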

alexfoo 2 hours ago | parent

> This stat is also complete bullshit. If it were true, scrubs of any 20+ TB pool would quite frequently turn up at least corrected errors. But this is not the case.

I would expect the ZFS code to be written with the expected BER in mind. If it reads something, computes the checksum and goes "uh oh", then it will probably first re-read the block/sector, see that the result is different, possibly re-read a third time, and if that comes back OK, continue on without even bothering to log an obvious BER-related error. I would expect it only bothers to log or warn about something when it repeatedly reads the same data that breaks the checksum.
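The retry-then-log policy hypothesized above might look something like the sketch below. This is not actual ZFS code; read_block and checksum_of are made-up placeholders standing in for the real read path and checksum routine.

    # Rough sketch of the hypothesized retry-then-log policy -- NOT actual ZFS code.
    def read_verified(read_block, checksum_of, expected, retries=2):
        for _ in range(retries + 1):
            data = read_block()
            if checksum_of(data) == expected:
                # A one-off bad read fixed itself on re-read: return the data
                # without treating it as an error worth reporting.
                return data
        # The same bad data keeps coming back: now it is worth logging/repairing.
        raise IOError("persistent checksum mismatch after %d reads" % (retries + 1))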

Caveat Reddit, but https://www.reddit.com/r/zfs/comments/3gpkm9/statistics_on_r... has some useful info in it. The OP starts off with a similar premise, that a BER of 10^-14 is rubbish, but then people in charge of very large pools of drives wade in with real-world experience to give more context.