| ▲ | qalmakka 2 days ago |
| RIP BCacheFS. I was hopeful I could finally have a modern filesystem mainlined in Linux (I don't trust Btrfs anymore), but I guess I'll keep having to install ZFS for the foreseeable future. As I predicted, out-of-tree bcachefs is basically dead on arrival - everybody interested is already on ZFS, and btrfs is still around only because ZFS can't be mainlined. |
|
| ▲ | StopDisinfo910 2 days ago | parent | next [-] |
| > btrfs is still around only because ZFS can't be mainlined basically
ZFS is extremely annoying with the way it handles expansion and the fact that you can’t mix drive sizes. It’s not a panacea. There is clearly room for an improved design. |
| |
| ▲ | cyphar 2 days ago | parent | next [-] | | This is being worked on (they call it AnyRaid), the work is being sponsored by HexOS[1]. [1]: https://hexos.com/blog/introducing-zfs-anyraid-sponsored-by-... | |
| ▲ | nubinetwork 2 days ago | parent | prev [-] | | Underprovision your disks, then you don't have to worry about those edge cases... | | |
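In case it's useful, a rough sketch of what underprovisioning can look like in practice - the device names and sizes here are made up, and the idea is just to give every disk an identically sized partition so mixed drives can join the same vdev:

    # carve an identical 3.5 TB partition out of each (differently sized) disk
    sudo parted -s /dev/sdb mklabel gpt
    sudo parted -s /dev/sdb mkpart zfs 1MiB 3500GiB
    sudo parted -s /dev/sdc mklabel gpt
    sudo parted -s /dev/sdc mkpart zfs 1MiB 3500GiB
    # build the pool out of the equally sized partitions rather than whole disks
    sudo zpool create tank mirror /dev/sdb1 /dev/sdc1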
| ▲ | StopDisinfo910 2 days ago | parent [-] | | If you need to consider how to buy your drives so you can use a filesystem, that’s a flaw of said filesystem, not an edge case. It’s clearly an acceptable one for a lot of people, but it does leave space for alternative designs. | | |
| ▲ | estimator7292 2 days ago | parent [-] | | This is the way it's always been. RAID can't really handle mismatched drives either, and you must consider that when purchasing drives. It's not a flaw, it's a consequence of geometry. | | |
| ▲ | StopDisinfo910 2 days ago | parent [-] | | It’s strange you say that because my btrfs array handles mismatched sizes just fine. | | |
| ▲ | Elhana 2 hours ago | parent [-] | | You can put an HDD and an SSD of different sizes in RAID1, but that doesn't mean you should. |
|
|
|
|
|
|
| ▲ | sureglymop 2 days ago | parent | prev | next [-] |
| I've never had any issues with either ZFS or Btrfs after 2020. I wonder what you all are doing to have such issues with them. |
| |
| ▲ | pantalaimon 2 days ago | parent | next [-] | | One lovely experience I had when trying to remove a failing disk from my array was that the `btrfs device remove` failed with an I/O error - because the device was failing. I then had to manually delete the file with the I/O error (for which I had to resolve the inode number it barfed into dmesg) and try again - until the next I/O error. (I'm still not sure if the disk was really failing. I did a full wipe afterwards and a full read to /dev/null and experienced no errors - might have just been the metadata that was messed up) | |
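For anyone who hits the same thing, a rough sketch of that dance - the inode number and paths below are made up:

    # map the inode number that btrfs barfed into dmesg back to a file path
    sudo btrfs inspect-internal inode-resolve 257 /mnt/array
    # delete (or salvage) that file, then retry the removal
    sudo btrfs device remove /dev/sdX /mnt/array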
| ▲ | Volundr 2 days ago | parent | prev | next [-] | | Pre-2020, but I had a BTRFS filesystem with over 40% free space start failing on all writes, including deletes, with a "no space left on device" error. Took the main storage array for our company offline for over a day while we struggled to figure out wtf was going on. Basically BTRFS marks blocks as data or metadata, and once marked a block won't be reassigned (without a rebalance). Supposedly this is better now, but this was after it had been stable for a few years. After that and some smaller foot guns, I'll never willingly run BTRFS on a critical system. | |
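If anyone lands in that ENOSPC-with-free-space state today, a rough sketch of the usual diagnosis and remedy - the mount point is made up:

    # see how much space is allocated to data vs. metadata block groups
    sudo btrfs filesystem usage /mnt/array
    # compact mostly-empty block groups so their space can be reallocated
    sudo btrfs balance start -dusage=10 -musage=10 /mnt/array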
| ▲ | pizza234 2 days ago | parent | prev | next [-] | | Just a few days ago I had a checksum mismatch on a RAID-1 setup, on the metadata on both devices, which is very confusing. Over the last one or two years I've twice experienced a checksum mismatch on the file storing the memory of a VMware Workstation virtual machine. Both are very likely bugs in Btrfs, and it's very unlikely they were caused by the user (me). In the relatively far past (around 5 years ago), I had the system (root being on Btrfs) turn unbootable for no obvious reason, a couple of times. | |
| ▲ | patrakov 2 days ago | parent | prev | next [-] | | I still have a btrfs with a big problem: more disk space used than expected. The explanation was helpfully provided by btdu:

    Despite not being directly used, these blocks are kept (and cannot be
    reused) because another part of the extent they belong to is actually
    used by files. This can happen if a large file is written in one go,
    and then later one block is overwritten - btrfs may keep the old
    extent which still contains the old copy of the overwritten block.
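In case it helps, a rough sketch of the usual way to release those pinned "bookend" extents - the path is made up, and note that defragmenting breaks reflink/snapshot sharing, so it can increase usage elsewhere:

    # rewrite the file so its partially unused old extents can finally be freed
    sudo btrfs filesystem defragment -v /mnt/array/vm-images/disk.img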
| | |
| ▲ | Elhana 2 hours ago | parent [-] | | With a clever script doing repeated writes and deletes, you can likely bring down any system running btrfs, and it will bypass quotas. |
| |
| ▲ | jamesnorden 2 days ago | parent | prev [-] | | Ah yes, the famous "holding it wrong". | | |
| ▲ | happymellon 2 days ago | parent | next [-] | | I've also not had issues with BTRFS. The question was about usage, because without knowing people's use cases and configurations there's no way to explain why it's unusable for you while working fine for others. | | |
| ▲ | pizza234 2 days ago | parent [-] | | If 1% of users report a given issue (say, data corruption), the fact that 99% of users report that they don't experience it doesn't mean that the issue is not critical. | | |
| ▲ | izacus 2 days ago | parent | next [-] | | The fact that you see an issue reported loudly on social media doesn't mean it's critical or more common than for other FSes. As usual with all these Linux debates, there's a loud group grinding old hatreds that can be a decade old. | |
| ▲ | const_cast a day ago | parent | prev | next [-] | | The problem is every filesystem can experience data corruption. That doesn't tell us anything about how it relates to BTRFS. Also, filesystems just work. Nobody is gonna say "oh I'm using filesystem X and it works!" because that's the default. So, naturally, 99% of the stuff you'll hear about filesystems is when they don't work. Don't believe me? Look up NTFS and read through reddit or stackexchange or whatever. Not a lot of happy campers. | |
| ▲ | eptcyka 13 hours ago | parent | next [-] | | I had a power failure and I lost the whole filesystem. Never happened with ext4 - I've had data loss after a power failure with other filesystems, but never an issue where I wasn't able to mount it and lost 100% of my data. | |
| ▲ | koverstreet a day ago | parent | prev [-] | | Do you see the same reports about ext4 or XFS? I don't. | | |
| |
| ▲ | happymellon 2 days ago | parent | prev [-] | | > If 1% of the users report a given issue (say, data corruption)
If 0.1% of users say it corrupted for them, and then don't provide any further details and no one can replicate their scenario, then it does make it hard to resolve. | | |
| ▲ | koverstreet a day ago | parent [-] | | the btrfs devs are also famous for being unresponsive to these sorts of issues. there's a feedback effect: if users know that a filesystem takes these kinds of issues seriously and will drop what they're doing and jump on them, a lot of users will very happily spend the time reporting bugs and working with devs to get it resolved. people don't like wasting their time on bug reports that go into the void. they do like contributing their time when they know it's going to get their issue fixed and make things better for everyone. this is why I regularly tell people "FEED ME YOUR BUG REPORTS! I WANT THEM ALL!" it's just what you have to do if you want your code to be truly bulletproof. | | |
| ▲ | jcalvinowens 15 hours ago | parent [-] | | > the btrfs devs are also famous for being unresponsive to these sorts No, Kent, they are not. Posting attacks like this without evidence is cowardly and dishonest. I'm not going to tolerate these screeds from you about people I've worked with and respect without calling you out. Every time you spew this toxicity, a chunk of bcachefs users reformat and walk away. Very soon, you'll have none left. |
|
|
|
| |
| ▲ | ziml77 2 days ago | parent | prev | next [-] | | If you complain about a knife crushing your food because you're holding it upside down, it's good for everyone else to know that context. Anyone who is using it with the sharp side down can safely ignore that problem rather than being scared away by an issue they won't experience. | |
| ▲ | metadat 2 days ago | parent | prev | next [-] | | I've experienced unrecoverable corruption with btrfs within the past 2 years. | |
| ▲ | motorest 2 days ago | parent | prev [-] | | > Ah yes, the famous "holding it wrong". Is it wrong to ask how to reproduce an issue? |
|
|
|
| ▲ | koverstreet a day ago | parent | prev | next [-] |
| The community is still growing (developers too!), and people have been jumping in to help out with getting DKMS support into the distros. bcachefs isn't going away. The SuSE guy also reversed himself after I asked; Debian too, so we have time to get the DKMS packages out. |
|
| ▲ | accelbred 2 days ago | parent | prev | next [-] |
| I switched to ZFS for a while but had to switch back because of how much was broken. Overlayfs had issues, reflinks didn't work, etc. Linux-specific stuff that just works on kernel filesystems was missing or buggy. I saw later that they added support for some of the missing features, but those had data corruption issues. Also I doubt it'll ever support fs-verity. I don't plan on giving ZFS or other filesystems not designed for Linux another go. |
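As a quick gut check on the reflink point, a minimal sketch - the filenames are made up; coreutils cp refuses rather than silently falling back when cloning isn't supported:

    # succeeds on filesystems with reflink support (btrfs, XFS, bcachefs),
    # fails with "Operation not supported" where cloning isn't available
    cp --reflink=always big-image.raw big-image-clone.raw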
|
| ▲ | EspadaV9 a day ago | parent | prev | next [-] |
| Might be harder to keep running ZFS on Linux after 6.18 https://www.phoronix.com/news/Linux-6.18-write-cache-pages |
| |
| ▲ | qalmakka 2 hours ago | parent [-] | | Killing ZFS on Linux would basically make Linux unsuitable for lots of use cases. What would you use instead? Btrfs, which keeps having stupid data corruption issues? Bcachefs, which is not yet stable and is now being struck out of the kernel? LVM2 + thin provisioning, which will happily eat your data if you overcommit? I hope some industrial players will force the kernel to drop this nonsense. Heck, no native filesystem besides btrfs has compression; I'm saving HUNDREDS of GB with zstd compression on my machines with basically zero overhead. |
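For reference, a minimal sketch of what that looks like on each side - mount point, pool, and dataset names are made up:

    # btrfs: transparent zstd compression via a mount option (level 3 here)
    sudo mount -o compress=zstd:3 /dev/sdX /mnt/data
    # OpenZFS: per-dataset zstd compression
    sudo zfs set compression=zstd tank/data
    # the separate compsize tool reports how much btrfs compression actually saves
    sudo compsize /mnt/data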
|
|
| ▲ | Ygg2 2 days ago | parent | prev | next [-] |
| Wait. You don't trust Btrfs, but you would trust BCacheFS, which is obviously very experimental? |
| |
| ▲ | phire 2 days ago | parent | next [-] | | Btrfs claims to be stable. IMO, it's not. It's generally fine if you stay on the happy path. It will work for 99% of people. But if you fall off that happy path, bad things might happen and nobody is surprised. In my personal experience, nobody associated with the project seems to trust a btrfs filesystem that fell off the happy path, and they strongly recommend you delete it and start from scratch. I was horrified to discover that they don't trust fsck to actually fix a btrfs filesystem into a canonical state. BCacheFS had the massive advantage that it knew it was experimental and embraced it. It took measures to keep data integrity despite the chaos, generally seems to be a better design and has a more trustworthy fsck. It's not that I'd trust BCacheFS, it's still not quite there (even ignoring project management issues). But my trust for Btrfs is just so much lower. | | |
| ▲ | ahartmetz 2 days ago | parent [-] | | btrfs seems to be a wonky, ill-considered design with ten years of hotfixes. bcachefs seems to be a solid design that is (or has been, it's mostly done) regularly improved where trouble was found. Now it's just fixing basically little coding oversights. In two years, I will trust bcachefs to be a much more reliable filesystem than btrfs. |
| |
| ▲ | rurban 2 days ago | parent | prev [-] | | Still more stable than btrfs. btrfs is also dead slow | | |
| ▲ | Iridiumkoivu 2 days ago | parent [-] | | I agree with this sentiment. Btrfs has destroyed itself on my testing/lab machines three times during the last two years, to the point where recovery wasn’t possible. Metadata corruption was the main issue (or that’s how it looks to me at least). As of now I trust BCacheFS way more, and I’ve given it roughly the same time to prove itself as Btrfs. BCacheFS has issues, but so far I’ve managed to resolve them without major data loss. Please note that I currently use ext4 in all ”really important” desktop/laptop installations and OpenZFS on my server - performance being the main concern for the desktops and reliability for the server. |
|
|
|
| ▲ | kiney 2 days ago | parent | prev [-] |
| btrfs has many technical advantages over zfs |
| |
| ▲ | debazel 2 days ago | parent | next [-] | | Yes, like destroying itself and losing all data. | | |
| ▲ | natebc 2 days ago | parent [-] | | ZFS is perfectly capable of this too. source: worked as a support engineer for a block storage company, witnessed hundreds of customers blowing one or both of their feet off with ZFS. | | |
| ▲ | hebocon 2 days ago | parent | next [-] | | To what extent are these customers blaming the hammer for hitting their thumb? (Legitimate question: I manage several PB with ZFS and would like to know where I should be more cautious.) | | |
| ▲ | natebc 2 days ago | parent | next [-] | | A great deal. Which is why my cringe reflex still activates when I read about people running ZFS in places that aren't super tightly configured. ZFS is just such a massively complex piece of software. There were legitimate bugs in ZFS that we hit. Mostly around ZIL/SLOG and L2ARC and the umpteen million knobs that one can tweak. | | |
| ▲ | TheNewsIsHere 2 days ago | parent | next [-] | | Customers blowing off their feet with ZFS because they felt the need to tweak tunables they didn’t need to use, or didn’t properly understand, is not the fault of ZFS though. You can do the same with just about any file system. In the Windows world you can blow your feet off with NTFS configuration too. Of course there have been bugs, but every filesystem has had data-impacting bugs. Redundancy and backups are a critical caveat for all file systems for a reason. I once heard it said that “you can always afford to lose the data you don’t have backed up”. I do not think that broadly applies (such as with individuals), but it certainly applies in most business contexts. | | |
| ▲ | natebc 2 days ago | parent [-] | | Yeah, my reaction is mostly that it's recommended so quickly and so frequently for general use. Obviously there's footguns in everything. Filesystem ones are just especially impactful. | | |
| ▲ | TheNewsIsHere a day ago | parent [-] | | Yep. I use ZFS at home, but on business-oriented NAS hardware with drives to match (generally). And I don't go asking it to do odd things or configure it bizarrely. I don't pass through drives named with Linux names (I prefer WWN to PCI address naming, at least at home). Etc. But a lot of people out there will slap a bunch of USB 2.0 hard drives on top of an old gaming computer. I'm all for experimenting, and I sympathize that it's expensive to run ZFS on "ZFS class" platforms and hardware. I don't begrudge others that. It would be really nice if there was something like ZFS that was a tad more flexible and right in the kernel, with consistent and concise user space tooling. Not everyone is comfortable with DKMS. |
|
| |
| ▲ | motorest 2 days ago | parent | prev [-] | | > A great deal. Which is why my cringe reflex (...) Can you provide some specifics? So far all I see is vague complaints with no substance, and when complainers are lightly pressed they go defensive. | | |
| ▲ | natebc 2 days ago | parent [-] | | I don't have specifics for how many people running a fork of ZFS on Linux (or the fork for opensolaris, nexenta, etc) have copy-pasted some configuration from a wiki/forum/stackexchange and ended up with a pool that's misconfigured in some subtly fatal way. I don't have any personal anecdotes to share about my own homelab or enterprise IT experience with ZFS because I don't use it at home and nowhere I've worked in IT has
used it. I did live through specific situations over several years in a support engineer role where a double-digit percentage of customers in enterprise configurations ended up somewhere between terrible performance and catastrophic data loss due to misunderstood configuration of a very complex piece of software. If you wanna use ZFS, use ZFS. I'm not the internet's crusader against it. I have no doubt there's thousands of PB out there of perfectly happy, well configured and healthy zpools. It has some truly next-gen features that are extremely useful. I've just seen it recommended so, so many times as a panacea when something simpler would be just as safe and long lasting. It's kinda like using Kubernetes to run a few containers. Right? | |
| ▲ | motorest a day ago | parent [-] | | > I don't have specifics (...). I don't have any personal anecdotes (...). I see. > I did live through specific situations over several years in a support engineer role where a double-digit percentage of customers in enterprise configurations ended up somewhere between terrible performance and catastrophic data loss due to misunderstood configuration of a very complex piece of software. I'm sorry, but this claim is outright unbelievable. If the project was even half as unstable as you claim it to be, no one would ever use it in production at all. Either you are leaving out critical details such as non-standard patches and usages that have no relationship with real-world usage, or you are fabricating tales. Also, it's telling that no one stepped forward to offer any concrete details and specifics on these hypothetical issues. Makes you think. | |
| ▲ | natebc 10 hours ago | parent [-] | | Well, I assure you I'm not making it up. If you can't believe that people will misconfigure complicated systems that almost no single person can completely understand, or that working in the storage industry exposes you to bizarre and interesting failures of both hardware and software (and firmware!), then I'm not sure what I can say to get you to take a story at face value. I'm not being melodramatic. You can take a story or leave it. I'm not here to convince you one way or another. And frankly I don't particularly appreciate being called a liar, btw. Nice use of quoting also. Good day. |
|
|
|
| |
| ▲ | nubinetwork 2 days ago | parent | prev [-] | | Pool feature mismatch on send/receive, dedup send/receive, new features breaking randomly on bleeding-edge releases. | | |
| ▲ | TheNewsIsHere 2 days ago | parent [-] | | The intent of feature flags in ZFS is to denote changes in on-disk structures. Replication isn't supported between pools that don't support the same flags, because otherwise ZFS couldn't read the data from disk properly on the receiving side. There are workarounds, with their respective caveats and warnings. |
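For example, a rough sketch of one such workaround - the pool names are made up, and the available compatibility files depend on the OpenZFS version installed:

    # see which feature flags each pool has enabled or active
    zpool get all tank | grep feature@
    # create the receiving pool pinned to an older, widely supported feature set
    sudo zpool create -o compatibility=openzfs-2.0-linux backup mirror /dev/sdX /dev/sdY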
|
| |
| ▲ | throw0101a 2 days ago | parent | prev [-] | | > source: worked as a support engineer for a block storage company, witnessed hundreds of customers blowing one or both of their feet off with ZFS. The phrasing of this leads me to believe that the customers set up ZFS in a 'strange' (?) way. Or was this a bug (or bugs) within ZFS itself? Because when people talk about Btrfs issues, they are talking about the code itself and bugs that cause volumes to go AWOL and such. (All file systems have foot-guns.) | |
| ▲ | natebc 2 days ago | parent [-] | | Mostly customers thinking they fully understand the thousands of parameters in ZFS. There was a _very_ nasty bug in the ZFS L2ARC that took out a few PB at a couple of large installations. This was back in 2012/2013, when multiple PBs were very expensive. It was a case of ZFS putting data from the ARC into the pool after the ZIL/SLOG had been flushed. |
|
|
| |
| ▲ | crest 2 days ago | parent | prev [-] | | Can you give an example? Because to me it has always appeared to be an NIH copy-cat fs. |
|