Hakkin 9 hours ago

A scrub only reads allocated space, so in your 10TB example, a scrub would only read whatever portion of that 10TB is actually occupied by data. It's also usually recommended to keep your usage below 80% of the total pool size to avoid performance issues, so the worst case in your scenario would be more like ~53% assuming you follow the 80% rule.
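For concreteness, here is a minimal back-of-envelope sketch of how a figure like ~53% could fall out. The scrub cadence (monthly) and the drive's rated workload (180 TB/year) are my assumptions, not numbers from this thread:

    # Back-of-envelope: what fraction of a drive's rated annual workload
    # do regular scrubs consume? All inputs below are assumptions.

    pool_size_tb = 10.0           # pool size from the example
    fill_fraction = 0.8           # following the "keep usage below 80%" rule
    scrubs_per_year = 12          # assumed monthly scrub cadence
    rated_workload_tb_yr = 180.0  # assumed drive workload rating (TB/year)

    scrub_read_tb_yr = pool_size_tb * fill_fraction * scrubs_per_year
    fraction_of_rating = scrub_read_tb_yr / rated_workload_tb_yr

    print(f"Scrubs read {scrub_read_tb_yr:.0f} TB/year "
          f"= {fraction_of_rating:.0%} of the rated workload")
    # -> Scrubs read 96 TB/year = 53% of the rated workload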

kanbankaren 29 minutes ago | parent | next [-]

Still, spending 53% of the useful life of a HDD just on scrubbing is excessive.

You don't lose tracks in 3 months. If you don't read them for a year and the HDD is operated at high temperatures, then the controller might struggle to read them.

The very act of scrubbing generates heat, so we should use it sparingly.

formerly_proven 8 hours ago | parent | prev [-]

Is the 80% rule real or just passed down across decades like other “x% free” rules? Those waste enormous amounts of resources on modern systems and I kind of doubt ZFS actually needs a dozen terabytes or more of free space in order to not shit the bed. Just like Linux doesn’t actually need >100 GB of free memory to work properly.

magicalhippo 2 hours ago | parent | next [-]

> Is the 80% rule real or just passed down across decades like other “x% free” rules?

As I understand it, the primary reason for the 80% rule was that you're getting close to another limit, which IIRC was around 90%, where the space allocator would switch from finding a nearby large-enough space to finding the best-fitting space. This second mode tanks performance and can lead to much more fragmentation. And since there's no defrag tool, you're stuck with that fragmentation.

This has also changed: now[1] the switch happens at 96% rather than 90%, and the code has been improved[2] to better keep track of free space.

However, performance can start to degrade before you reach this algorithm switch[3], as you're more likely to generate fragmentation the less free space you have.

That said, it was also generic advice that ignored your specific workload. If you have a lot of cold data with low churn and fairly uniform file sizes, you're probably less affected than if you have high churn with lots of files of varied sizes.
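A toy illustration of the first-fit vs. best-fit trade-off described above (this is not ZFS's actual metaslab allocator, just a sketch of the general idea):

    # Toy free-space allocator, purely to illustrate first fit vs. best fit.
    # ZFS's real metaslab allocator is far more sophisticated than this.

    def first_fit(free_segments, size):
        """Return index of the first segment big enough -- cheap, stops early."""
        for i, seg in enumerate(free_segments):
            if seg >= size:
                return i
        return None

    def best_fit(free_segments, size):
        """Return index of the tightest-fitting segment -- must scan everything."""
        best = None
        for i, seg in enumerate(free_segments):
            if seg >= size and (best is None or seg < free_segments[best]):
                best = i
        return best

    free_segments = [64, 8, 256, 16, 128, 4, 32]  # free extent sizes (arbitrary units)
    print(first_fit(free_segments, 30))  # -> 0 (stops at the first 64-unit extent)
    print(best_fit(free_segments, 30))   # -> 6 (walked the whole list to find 32)

Note that best fit also leaves a tiny 2-unit sliver behind here, the kind of barely usable remainder that accumulates as fragmentation, and with no defrag tool it sticks around.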

[1]: https://openzfs.github.io/openzfs-docs/Performance%20and%20T...

[2]: https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFra...

[3]: https://www.bsdcan.org/2016/schedule/attachments/366_ZFS%20A...

barrkel 6 hours ago | parent | prev | next [-]

In practice you see noticeable performance degradation for streaming reads of large files written after the pool passes 85% full or so. Files you could previously expect to read at 500+ MB/sec can drop to 50 MB/sec. It's fragmentation, and it's fairly scale-invariant in my experience.
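A rough model, with made-up but plausible numbers (10 ms per seek, 500 MB/s sequential rate, 10 GB file), of how fragmentation alone can drag a streaming read down by an order of magnitude:

    # Rough throughput model for a fragmented file: each fragment costs a seek
    # before it can be streamed. All numbers are illustrative assumptions.

    def effective_throughput(file_mb, fragments, seq_mb_s=500.0, seek_s=0.010):
        """Effective MB/s when reading `file_mb` split into `fragments` extents."""
        transfer_time = file_mb / seq_mb_s  # time spent actually streaming
        seek_time = fragments * seek_s      # time lost repositioning the head
        return file_mb / (transfer_time + seek_time)

    for frags in (1, 100, 1000, 10000):
        print(f"{frags:>6} fragments: {effective_throughput(10_000, frags):6.0f} MB/s")
    # -> roughly 500, 476, 333 and 83 MB/s with these assumed numbers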

cornonthecobra 6 hours ago | parent | prev [-]

Speaking strictly about ZFS internal operations, the free space requirement is closer to 5% on current ZFS versions. That allows for CoW and block reallocations in real-world pools. Heavy churn and very large files will increase that margin.
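To put the two rules side by side, a quick comparison (my numbers, just arithmetic) of how much absolute headroom the traditional 80% rule sets aside versus a ~5% reserve; the 60 TB row is roughly the "dozen terabytes" case mentioned upthread:

    # Absolute headroom implied by the 80% rule vs. a ~5% reserve.
    # Pool sizes and the comparison itself are illustrative assumptions.

    for pool_tb in (10, 60, 100):
        rule_80 = pool_tb * 0.20  # free space kept by the "stay under 80%" rule
        rule_5 = pool_tb * 0.05   # free space kept by a ~5% internal reserve
        print(f"{pool_tb:>4} TB pool: 80% rule reserves {rule_80:4.1f} TB, "
              f"5% reserve keeps {rule_5:4.1f} TB free")
    # -> 10 TB: 2.0 vs 0.5 TB; 60 TB: 12.0 vs 3.0 TB; 100 TB: 20.0 vs 5.0 TB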