jtickle 2 days ago

All of the "btrfs eats your data" bugs have been fixed, and the people who constantly repeat them are people who relied on an experimental filesystem for files they could not afford to lose. FUD all around. I have a btrfs filesystem on my home file server that's been running just fine for almost 10 years now and has survived the mechanical death of the original underlying hard drives. Since then I have used it in plenty of production environments.

Don't do RAID 5. Just don't. That's not just a btrfs shortcoming. I lost a hardware RAID 5 array due to "puncture", which would have been fascinating to learn about if it hadn't happened to a production database. It's an academically interesting concept, but it is too dangerous, especially with how large drives are now. If you're buying three drives, buy four instead. RAID 10 is much safer, especially for software RAID.
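To put a rough number on the "how large drives are now" point: a RAID 5 rebuild has to read every surviving drive in full, so the chance of hitting at least one unrecoverable read error (URE) grows with drive size. The sketch below is a back-of-the-envelope model, not a measurement. It assumes a URE rate of 1 per 10^14 bits (a figure commonly quoted on consumer drive spec sheets, and often pessimistic in practice) and independent errors; the function name and parameters are made up for illustration.

```python
import math


def rebuild_ure_probability(surviving_drives: int, drive_tb: float,
                            ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one URE while reading all surviving drives.

    Assumptions (not from the thread): errors are independent per bit and
    the spec-sheet rate `ure_per_bit` applies uniformly.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8  # TB -> bits
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # Three-drive RAID 5: a rebuild reads the two surviving drives.
    for size_tb in (1, 4, 12):
        p = rebuild_ure_probability(surviving_drives=2, drive_tb=size_tb)
        print(f"{size_tb:>2} TB drives: P(at least one URE) = {p:.2f}")
```

Under this model the risk climbs quickly with capacity, which is the usual argument for mirrors (RAID 1/10) over single-parity RAID on large drives.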

Stop parroting lies about btrfs. Since it became marked stable, it has been a reliable, trustworthy, performant filesystem.

But as much as I trust it I also have backups because if you love your data, it's your own fault if you don't back it up and regularly verify the backups.

plqbfbv 2 days ago

> All of the "btrfs eats your data" bugs have been fixed ... I have a btrfs filesystem on my home file server that's been running just fine for almost 10 years now and has survived the mechanical death of the original underlying hard drives

In the last 10 years, btrfs:

1. Blew up three times on two unrelated systems due to internal bugs (one a desktop, one a server). Very few people were/are aware of the remount-only-once-in-degraded "FEATURE": if a filesystem crashed, you could mount it with -o degraded exactly once, and after that the superblock would prevent any further mount (error: invalid superblock). I'm not sure whether that's still the case or whether it got fixed (I hope so). By the way, these were RAID1 arrays with 2 identical disks with metadata=dup and data=dup, so the filesystem was definitely mountable and usable. It basically killed the use case of RAID1 for availability. ZFS has allowed me to perform live data migrations while missing one or two disks across many reboots.

2. Developers merged patches to mainline, later released to stable, that completely broke discard=async (or something similar), which was a supported mount option according to the manpages. My desktop SSD basically ate itself, and I had to restore from backups. IIRC the bug/mailing list discussions I found later were along the lines of "nobody should be using it", so no impact.

3. Had (maybe still has - haven't checked) a bug where if you fill the whole disk, and then remove data, you can't rebalance, because the filesystem sees it has no more space available (all chunks are allocated). The trick I figured out was to shrink the filesystem to force data relocation, then re-expand it, then balance. It was ~5 years ago and I even wrote a blog post about it.

4. Quota tracking when using docker subvolumes is basically unusable due to the btrfs-cleaner "background" task (imagine VSCode + DevContainers taking 3 minutes on a modern SSD to clean up 1 big docker container). This is on 6.16.

5. Hit a random bug just 3 days ago on 6.16, where I was doing periodic rebalancing and removing a docker subvolume. 200+ lines of logs in dmesg, filesystem "corrupted" and remounted read-only. I was already sweating, not wanting to spend hours restoring from backups, but unexpectedly the filesystem mounted correctly after reboot. (first pleasant experience in years)
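For reference, the shrink / re-expand / balance workaround from point 3 could look roughly like the sequence below. This is a dry-run sketch that only builds and prints the commands. The subcommands (`btrfs filesystem resize`, `btrfs balance start -dusage=N`) are standard btrfs-progs, but the shrink amount, the usage filter, the mount point, and the helper name are assumptions for illustration, not the exact steps from the blog post.

```python
import shlex


def shrink_expand_balance_cmds(mountpoint: str, shrink_by: str = "10G",
                               data_usage: int = 50) -> list[list[str]]:
    """Build (but do not run) the three-step workaround as argv lists.

    `shrink_by` and `data_usage` are illustrative guesses; suitable values
    depend on how full the filesystem is.
    """
    return [
        # 1. Shrink: chunks beyond the new end must be relocated.
        ["btrfs", "filesystem", "resize", f"-{shrink_by}", mountpoint],
        # 2. Grow back to the full device; the reclaimed range is now unallocated.
        ["btrfs", "filesystem", "resize", "max", mountpoint],
        # 3. Balance data chunks that are at most `data_usage` percent full.
        ["btrfs", "balance", "start", f"-dusage={data_usage}", mountpoint],
    ]


if __name__ == "__main__":
    # "/mnt/data" is a placeholder mount point.
    for argv in shrink_expand_balance_cmds("/mnt/data"):
        print(shlex.join(argv))
```

Running the printed commands needs root and a current backup; whether the shrink step succeeds on a completely full filesystem depends on how much space inside the existing chunks is actually free.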

ZFS in 10y+ has basically only failed me when I had bad non-ECC RAM, period. Unfortunately I want the latest features for graphics etc on my desktop and ZFS being out of tree is a no-go. I also like to keep the same filesystem on desktop and server, so I can troubleshoot locally if required. So now I'm still on btrfs, but I was really banking on bcachefs.

Oh well, at least I won't have to wait >4 weeks for a version that I can compile with the latest stable kernel.

The only stable implementation is Synology's, the rest, even mainline stable, failed on me at least once in the last 10 years.

greyw 12 hours ago

> Quota tracking when using docker subvolumes is basically unusable due to the btrfs-cleaner "background" task (imagine VSCode + DevContainers taking 3 minutes on a modern SSD to clean up 1 big docker container). This is on 6.16.

I had to disable quota tracking. It lags my whole desktop whenever that shit is running in the background. Makes it unusable on an interactive desktop.

arccy 2 days ago

"performant", it's still slow if you actually use any of the advanced features like copy on write.

FirmwareBurner 2 days ago

Every CoW filesystem is just as slow. There's no magic pill to fix performance; it's a known tradeoff.

koverstreet a day ago

Not inherently.

Early bcachefs was ridiculously fast; it's gotten slower as we've grown all the features to compete with ZFS. All the database stuff that gives us amazing flexibility adds overhead (the btree iterator code has gotten fat), and backpointers and modern accounting blew up our journalling overhead. A lot of things that we needed for scalability, or hardening/self healing, have added overhead.

COW really isn't the main thing; the tricky part is cramming in all the features that we want these days while keeping the fastpaths fast.

But, a lot of this stuff is fixable - performance just hasn't been the priority, since the actual users aren't complaining about performance and are instead clamoring for things like erasure coding.

(and, the performance numbers that I've seen comparing us to ZFS still put us significantly faster)

betaby 2 days ago

> FUD all around

????

> Don't do RAID 5.

Ah, OK, so not FUD

> Stop parroting lies about btrfs.

I seee

yjftsjthsd-h 2 days ago

Yeah, no. I've had btrfs lose a root filesystem on a laptop with only one disk. No RAID, nothing fancy, well after it was supposed to be stable, on openSUSE, where I assumed it would be well supported and ship with good defaults.

Claiming that anyone reporting problems is lying is acting in bad faith and makes your argument weaker.

Also, "works for me" isn't terribly convincing.