nullc 3 days ago
It's pretty straightforward to use a normal checksum to correct single-bit or even multi-bit errors (depending on the block size, choice of checksum, etc.). Though I expect those bit errors are bus/RAM, and hopefully usually transient. If there is corruption on the media, the whole block is usually going to be lost, because any corruption means the drive's internal block-level FEC has more errors than it can correct. I was thinking more along the lines of adding dozens or hundreds of correction blocks to a whole file, along the lines of par (though there are much faster techniques now).
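[Editor's note: a minimal sketch of the brute-force recovery described above, in Python, assuming a CRC32 checksum stored per block and at most one flipped bit. The function name and test payload are illustrative, not from the thread.]

    import zlib

    def correct_single_bit_error(block: bytes, expected_crc: int):
        """Return a corrected copy of block, or None if no single-bit
        flip reproduces expected_crc."""
        if zlib.crc32(block) == expected_crc:
            return block                      # already intact
        buf = bytearray(block)
        for i in range(len(buf) * 8):         # try every bit position
            buf[i >> 3] ^= 1 << (i & 7)       # flip one candidate bit
            if zlib.crc32(bytes(buf)) == expected_crc:
                return bytes(buf)             # checksum matches: recovered
            buf[i >> 3] ^= 1 << (i & 7)       # undo and keep scanning
        return None                           # more than one bit is bad

    # Usage: inject a single-bit error and recover the original data.
    data = b"some 4k block payload"
    crc = zlib.crc32(data)
    damaged = bytearray(data)
    damaged[5] ^= 0x20
    assert correct_single_bit_error(bytes(damaged), crc) == data

The brute force costs one CRC pass per candidate bit; because CRC is linear over GF(2), the syndrome of each single-bit flip can be precomputed, turning correction into a single table lookup.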
koverstreet 3 days ago | parent
You'd think that, wouldn't you? But there are enough moving parts in the IO stack below the filesystem that we do see bit errors. I don't have enough data to do correlations and tell you likely causes, but they do happen. I think SSDs are generally worse than spinning rust (especially enterprise-grade SCSI kit); the hard drive vendors have been at this a lot longer, and SSDs are massively more complicated. From the conversations I've had with SSD vendors, I don't think they've put the same level of effort into making things as bulletproof as possible yet.