saurik 18 hours ago

While those are all "filesystems", they are also (internally) alternatives to MD RAID; like, you could run zfs on top of MD RAID, but it feels like a waste of zfs (and the same largely goes for btrfs and bcachefs). It thereby is not at all clear to me that it is the filesystems that are "immune to this issue" rather than their respective RAID-like behaviors, as it seems to be the latter that the discussion was focussing on (hence the initial mention of potentially adding btrfs to the issue, which did not otherwise mention any filesystem at all). Put another way: if you did do the unusual thing of running zfs on top of MD RAID, I actually bet you are still vulnerable to this scenario.

(Oh, unless you are maybe talking about something orthogonal to the fixes mentioned in the discussion thread, such as some property of the extra checksumming done by these filesystems? And so, even if the disks de-synchronize, maybe zfs will detect an error if it reads "the wrong one" off of the underlying MD RAID, rather than ending up with the other content?)

ludocode 16 hours ago

These filesystems are not really alternatives because mdraid supports features those filesystems do not. For example, parity raid is still broken in btrfs (so it effectively does not support it), and last I checked zfs can't grow a parity raid array while mdraid can.

I run btrfs on top of mdraid in RAID6 so I can incrementally grow it while still having copy-on-write, checksums, snapshots, etc.

I hope that one day btrfs fixes its parity raid or bcachefs will become stable enough to fully replace mdraid. In the meantime I'll continue using mdraid with a copy-on-write filesystem on top.

bestham 2 hours ago

Like everything else in engineering, it is a matter of trade-offs. The setup you chose to run really hampers the usefulness of having a checksumming file system, since it cannot simply get the correct data from another drive. As a peer pointed out: ZFS does support adding additional drives to expand a RaidZ (with some trade-offs). What you cannot do is change the raid topology on the fly.

bananapub 15 hours ago

> zfs can't grow a parity raid array while mdraid can.

Indeed out of date: that was merged a long time ago and shipped in a stable version earlier this year.

koverstreet 10 hours ago

soon :)

Polizeiposaune 16 hours ago

ZFS puts checksums in the block pointer, so, unless you disable checksums, it always knows the expected checksum of a block it is about to read.

When the actual checksum of what was read from storage doesn't match the expected value, it will try reading alternate locations (if there are any), and it will write back the corrected block if it succeeds in reconstructing a block with the expected checksum.
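
To make that read path concrete, here is a toy sketch in Python. It is not ZFS code: `BlockPointer`, `read_block`, and the dict-backed "devices" are hypothetical names invented for illustration, and SHA-256 stands in for whichever checksum algorithm the pool is configured to use. It only models the idea that the expected checksum lives in the pointer, so the reader can tell a good copy from a bad one and repair the bad one.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum for the sketch; assumption, not ZFS's default.
    return hashlib.sha256(data).hexdigest()


class BlockPointer:
    """Hypothetical pointer: where the copies live plus the expected checksum."""

    def __init__(self, locations, expected):
        self.locations = locations  # list of (device_index, offset)
        self.expected = expected    # checksum stored in the parent, not with the data


def read_block(devices, bp):
    """Try each location until one matches, then rewrite the copies that failed."""
    failed = []
    for dev, off in bp.locations:
        data = devices[dev].get(off)
        if data is not None and checksum(data) == bp.expected:
            # Self-heal: overwrite every copy that failed verification.
            for bad_dev, bad_off in failed:
                devices[bad_dev][bad_off] = data
            return data
        failed.append((dev, off))
    # No alternate location produced the expected checksum.
    raise IOError("checksum mismatch on all copies")


# Two mirrored "devices" that have de-synchronized at offset 0.
devices = [{0: b"good"}, {0: b"stale"}]
bp = BlockPointer(locations=[(1, 0), (0, 0)], expected=checksum(b"good"))
result = read_block(devices, bp)
```

With two locations, the stale copy is detected, the good one is returned, and the stale one is rewritten. If the pointer had only a single location (roughly the situation when the filesystem sees one MD device and cannot address the individual disks), the same mismatch would be detected but there would be nowhere else to read from, so the sketch raises an error instead of repairing.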