rincebrain 5 hours ago

The two obvious examples that come to mind are native encryption bugs and spacemap issues.

Nothing about walking the entire tree of blocks and checking hashes validates the spacemaps. They only come up when you're allocating new blocks, and there have been a number of bugs where ZFS panics because the spacemaps say something insane. If you import RW, it panics about trying to allocate an already-allocated segment, so you wind up needing to import readonly or discard the ZIL. And if your on-disk spacemaps are inconsistent in a way that discarding the ZIL doesn't work around, you would need some additional tool to try to repair this, because ZFS has no knobs for it.

Native encryption issues wouldn't be noticed because scrubbing doesn't attempt to untransform data blocks - you indirectly do that when you're walking the structures involved, but the L0 data blocks don't get decompressed or decrypted, since all your hashes are of the transformed blocks. And if you have a block where the hash in the metadata is correct but it doesn't decrypt, for any reason, scrub won't notice, but you sure will if you ever try to decrypt it.
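
To make that concrete, here is a toy model of the failure mode, not ZFS code. Everything in it is an assumption made for illustration: the block layout, the names `write_block`, `scrub_ok` and `read_block`, and the stand-in cipher (a SHA-256 keystream plus an HMAC tag instead of ZFS's real AEAD). The only point is that a checksum taken over the transformed bytes can match while decryption still fails.

```python
import hashlib
import hmac


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHA-256 based keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "little")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def write_block(key: bytes, plaintext: bytes) -> dict:
    """Encrypt, then checksum the *ciphertext* (hypothetical block layout)."""
    ciphertext = keystream_xor(key, plaintext)
    return {
        "data": ciphertext,
        "checksum": hashlib.sha256(ciphertext).digest(),
        "mac": hmac.new(key, ciphertext, hashlib.sha256).digest(),
    }


def scrub_ok(block: dict) -> bool:
    """Scrub-like check: needs no key, only hashes the stored bytes."""
    return hashlib.sha256(block["data"]).digest() == block["checksum"]


def read_block(key: bytes, block: dict) -> bytes:
    """Read path: authenticate with the key, then decrypt."""
    expected = hmac.new(key, block["data"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, block["mac"]):
        raise ValueError("authentication failed")
    return keystream_xor(key, block["data"])


good = write_block(b"right key", b"hello")
# Simulated bug: the block was sealed under a key the reader does not have.
bad = write_block(b"wrong key", b"hello")
```

Both `good` and `bad` pass `scrub_ok`, but `read_block(b"right key", bad)` raises, which is the "scrub won't notice, but you will" situation.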

mustache_kimono 4 hours ago | parent [-]

> The two obvious examples

Appreciate this, rincebrain. I know you know this better than most, and this certainly covers my 2nd point. I don't imagine these cases cover my first point though? These are not bugs of the type a fsck would catch?

rincebrain an hour ago | parent [-]

A fsck could pretty readily notice and repair the spacemap inconsistencies - zdb already generates its own spacemaps and compares to reality on import.
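
As a rough sketch of what that kind of cross-check looks like (a toy model, not zdb's implementation): walk everything reachable from the root, build the set of space that is actually referenced, and diff it against what the stored spacemap claims. The dict-based block tree, the `(offset, size)` tuples and the function names are all hypothetical.

```python
def referenced_segments(block):
    """Yield (offset, size) for a block and everything reachable from it."""
    yield (block["offset"], block["size"])
    for child in block.get("children", []):
        yield from referenced_segments(child)


def units(segments):
    """Expand (offset, size) segments into a set of allocation units."""
    covered = set()
    for offset, size in segments:
        covered.update(range(offset, offset + size))
    return covered


def check_spacemap(root, spacemap):
    """Compare space reachable from root with what the spacemap records."""
    reachable = units(referenced_segments(root))
    recorded = units(spacemap)
    return {
        # In use by the tree but marked free: the next allocation could clobber it.
        "referenced_but_free": sorted(reachable - recorded),
        # Marked allocated but nothing points at it: leaked space.
        "allocated_but_unreferenced": sorted(recorded - reachable),
    }


root = {
    "offset": 0,
    "size": 1,
    "children": [
        {"offset": 1, "size": 2},
        {"offset": 4, "size": 1},
    ],
}
stored_spacemap = [(0, 3), (5, 1)]  # deliberately inconsistent with the tree
report = check_spacemap(root, stored_spacemap)
```

Here `report` flags unit 4 as referenced but free and unit 5 as allocated but unreferenced. A repair tool could then rewrite the spacemap from the reachable set, which is the part ZFS has no knob for today.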

If you have the keys, technically nothing stops a fsck from noticing the encryption problems, but yes, usually it wouldn't unless you had some known issue you added special detection for, like when XFS years ago had problems if you mounted it with inode64 once and then not the next time, so the inode numbers would wrap around.