spoaceman7777 20 hours ago

A solution I haven't yet seen in this thread is to buy multiple drives and sacrifice the capacity of one of them to maintain single parity via a raidz1 configuration with zfs. (raidz2 or raidz3 are likely better, as you can guard against full drive failures as well, but you'd need to give up more drives' worth of capacity for parity.)

zfs, in these filesystem-level parity-raid configurations, also auto-repairs corrupted data whenever it is read, and scrubbing (zpool scrub) provides an additional tool for finding and correcting such issues proactively.
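To make the "auto-repairs whenever read" part concrete, here is a rough sketch of the idea, not zfs's actual code. It assumes a checksum of the whole block is stored separately from the data (sha256 here for convenience), and that single parity is a plain XOR of equal-sized columns. The names read_block and xor_blocks are made up for illustration, and the real raidz layout is more involved.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def read_block(columns, parity, expected_checksum):
    """Return the block's data, repairing one bad column if necessary.

    columns: list of per-drive byte strings (mutated in place on repair)
    parity: XOR of the original columns
    expected_checksum: sha256 digest of the original joined data
    """
    data = b"".join(columns)
    if hashlib.sha256(data).digest() == expected_checksum:
        return data  # checksum matches, nothing to repair

    # Checksum mismatch: assume each column in turn is the bad one,
    # rebuild it from the others plus parity, and re-check the checksum.
    for bad in range(len(columns)):
        others = columns[:bad] + columns[bad + 1:]
        rebuilt = xor_blocks(others + [parity])
        candidate = b"".join(columns[:bad] + [rebuilt] + columns[bad + 1:])
        if hashlib.sha256(candidate).digest() == expected_checksum:
            columns[bad] = rebuilt  # write the good copy back ("self-heal")
            return candidate

    raise IOError("more damage than single parity can repair")


# Hypothetical stripe across three data drives plus one parity drive.
columns = [b"arch", b"ival", b"data"]
parity = xor_blocks(columns)
checksum = hashlib.sha256(b"".join(columns)).digest()

columns[2] = b"dat\x00"  # simulate silent corruption on one drive
recovered = read_block(columns, parity, checksum)
```

The checksum is what tells you which reconstruction is the right one; parity alone can only tell you that something in the stripe is inconsistent.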

This applies to both HDDs and SSDs. So, a good option for just about any archival use case.

fsckboy 20 hours ago | parent | next [-]

this is about drives that are not plugged in. are you saying parity would let you simply detect that the data had gone bad? increasing the number of drives would increase the decay rate, more possibilities for a first one to expire. if your parity drive expired first, you would think you had errors when you didn't yet.

spoaceman7777 19 hours ago | parent [-]

No, I'm talking about parity raid (raidz1/z2/z3, or, more familiarly, raid 5 and 6).

In a raidz1, you set aside one drive's worth of space, out of n drives, to store parity data. As long as you don't lose the same piece of data on more than one drive, you can reconstruct it when the drives are brought back online.
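A minimal sketch of how single parity lets you rebuild a lost piece, assuming plain XOR parity over equal-sized blocks (as in textbook raid 5; the actual raidz on-disk layout differs). The drive contents and the xor_blocks helper are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


# One stripe spread over three hypothetical data drives, plus a parity drive.
data = [b"arch", b"ival", b"data"]
parity = xor_blocks(data)

# Suppose the block on drive 1 is lost. XOR-ing the surviving blocks
# with the parity block yields the missing block again.
lost_index = 1
survivors = [blk for i, blk in enumerate(data) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity])
```

If two blocks in the same stripe are gone, this no longer works with a single parity block, which is the case raidz2 and raidz3 cover.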

And, since the odds of losing the same piece of data on more than one drive are much lower than the odds of losing any piece of data at all, it's safer. Up it to two drives' worth of parity, and you can even suffer a complete drive failure in addition to sporadic data loss.

fweimer 20 hours ago | parent | prev [-]

How would this work? Wouldn't all these drives start losing data at roughly the same time?

spoaceman7777 19 hours ago | parent [-]

Yes, but different pieces of data. The stored parity allows you to reconstruct any piece of data as long as it is only lost on one of the drives (in the single parity scenario).

The odds of losing the same piece of data on multiple drives are much lower than the odds of losing any piece of data at all.
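A back-of-the-envelope sketch of that comparison, under the loud assumption that each drive corrupts its piece of a given stripe independently with the same probability p. Real drives stored together may well fail in correlated ways, and the value of p below is invented purely for illustration.

```python
def p_any_loss(p, n):
    """Probability that at least one of n drives loses its piece of a stripe."""
    return 1 - (1 - p) ** n


def p_unrecoverable_single_parity(p, n):
    """Probability that two or more of n drives lose their piece of the
    same stripe, which is more than single parity can repair."""
    none_lost = (1 - p) ** n
    exactly_one_lost = n * p * (1 - p) ** (n - 1)
    return 1 - none_lost - exactly_one_lost


p = 1e-4  # assumed per-drive, per-stripe corruption probability
n = 4     # drives in the raidz1, including parity

some_loss = p_any_loss(p, n)
beyond_repair = p_unrecoverable_single_parity(p, n)
```

For small p, the first quantity grows roughly like n*p while the second grows roughly like p squared, which is where the gap comes from.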

danparsonson 19 hours ago | parent [-]

But the data is not disappearing, it's corrupted - so how do you know which bits are good and which are not?