sureglymop 2 days ago

I've never had any issues with either ZFS or Btrfs after 2020. I wonder what you all are doing to have such issues with them.

pantalaimon 2 days ago | parent | next [-]

One lovely experience I had when trying to remove a failing disk from my array was that the `btrfs device remove` failed with an I/O error - because the device was failing.

I then had to manually delete the file with the I/O error (for which I had to resolve the inode number it barfed into dmesg) and try again - until the next I/O error.
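
Roughly, the loop looked like this (mount point, device, and inode number are made up):

    # map the inode number from dmesg back to a path
    btrfs inspect-internal inode-resolve 257 /mnt/array

    # delete the offending file, then retry the removal
    rm /mnt/array/path/to/bad-file
    btrfs device remove /dev/sdd /mnt/array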

(I'm still not sure the disk was really failing. I did a full wipe afterwards and a full read to /dev/null and saw no errors - it might have just been the metadata that was messed up.)

Volundr 2 days ago | parent | prev | next [-]

Pre-2020, but I had a BTRFS filesystem with over 40% free space start failing on all writes, including deletes, with a "no space left on device" error. Took the main storage array for our company offline for over a day while we struggled to figure out wtf was going on. Basically, BTRFS marks blocks as data or metadata, and once marked a block won't be reassigned (without a rebalance). Supposedly this is better now, but this was after it had been stable for a few years. After that and some smaller footguns, I'll never willingly run BTRFS on a critical system.
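
If anyone hits this today: the usual escape hatch is to check the data/metadata split and run a filtered rebalance, something like this (mount point is an example):

    # show how space is divided between data and metadata block groups
    btrfs filesystem usage /mnt/array

    # rewrite block groups that are less than half full, so the freed
    # ones can be reallocated as either data or metadata
    btrfs balance start -dusage=50 -musage=50 /mnt/array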

pizza234 2 days ago | parent | prev | next [-]

Just a few days ago I had a checksum mismatch on a RAID-1 setup, in the metadata on both devices, which is very confusing.

Over the last year or two, I've twice experienced a checksum mismatch on the file storing the memory of a VMware Workstation virtual machine.

Both are very likely bugs in Btrfs, and it's very unlikely that they were caused by the user (me).

In the relatively far past (around 5 years ago), I had the system (root being on Btrfs) turn unbootable for no obvious reason, a couple of times.
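
For context, mismatches like these typically surface via a scrub; roughly (mount point is an example):

    # re-read everything and verify checksums
    btrfs scrub start /mnt/data
    btrfs scrub status /mnt/data

    # per-device counters of corruption and I/O errors
    btrfs device stats /mnt/data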

patrakov 2 days ago | parent | prev | next [-]

I still have a btrfs filesystem with a big problem: more disk space used than expected. The explanation was helpfully provided by btdu:

    Despite not being directly used, these blocks are kept (and cannot be reused) because another part of the extent they belong to is actually used by files.
    
    This can happen if a large file is written in one go, and then later one block is overwritten - btrfs may keep the old extent which still contains the old copy of the overwritten block.
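
A quick way to see the effect by hand, assuming a btrfs mount at /mnt/pool (path and sizes are made up):

    # write a large file in one go, so it lands in a few big extents
    dd if=/dev/zero of=/mnt/pool/big bs=1M count=128
    sync

    # overwrite a single 4 KiB block in place; CoW writes a new extent
    dd if=/dev/urandom of=/mnt/pool/big bs=4k count=1 seek=100 conv=notrunc
    sync

    # the old extent can stay fully allocated even though one of its
    # blocks is no longer referenced; defragmenting rewrites the file
    # and lets such extents be freed
    btrfs filesystem defragment /mnt/pool/big
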
Elhana 2 hours ago | parent [-]

With a clever script doing repeated writes and deletes, you can likely bring down any system running btrfs, and it will bypass quotas.

jamesnorden 2 days ago | parent | prev [-]

Ah yes, the famous "holding it wrong".

happymellon 2 days ago | parent | next [-]

I've also not had issues with BTRFS.

The question was about usage, because without knowing people's use cases and configurations, it'll never be possible to explain why it's unusable for you while working fine for others.

pizza234 2 days ago | parent [-]

If 1% of users report a given issue (say, data corruption), the fact that the other 99% report they don't experience it doesn't mean the issue isn't critical.

izacus 2 days ago | parent | next [-]

The fact that you see an issue reported loudly on social media doesn't mean it's critical or more common than for other FSes.

As usual with all these Linux debates, there's a loud group grinding old hatreds that can be a decade old.

const_cast a day ago | parent | prev | next [-]

The problem is every filesystem can experience data corruption. That doesn't tell us anything about how it relates to BTRFS.

Also, filesystems just work. Nobody is gonna say "oh I'm using filesystem X and it works!" because that's the default. So, naturally, 99% of the stuff you'll hear about filesystems is when they don't work.

Don't believe me? Look up NTFS and read through reddit or stackexchange or whatever. Not a lot of happy campers.

eptcyka 13 hours ago | parent | next [-]

I had a power failure and lost the whole filesystem. That never happened with ext4 - I've had data loss after a power failure with other filesystems, but never a case where I couldn't mount the filesystem at all and lost 100% of my data.

koverstreet a day ago | parent | prev [-]

Do you see the same reports about ext4 or XFS?

I don't.

happymellon a day ago | parent [-]

https://forum.endeavouros.com/t/6-1-64-1-lts-kernel-linux-lt...

It's also the reason I'm completely against the Debian "backporting" methodology of pretending they aren't using the new version of something.

koverstreet a day ago | parent [-]

yeah there's no real QA process for the stable trees

happymellon 2 days ago | parent | prev [-]

> If 1% of users report a given issue (say, data corruption)

If 0.1% of users say it corrupted for them, but then don't provide any further details and no one can replicate their scenario, it does make it hard to resolve.

koverstreet a day ago | parent [-]

the btrfs devs are also famous for being unresponsive to these sorts of issues.

there's a feedback effect: if users know that a filesystem takes these kinds of issues seriously and will drop what they're doing and jump on them, a lot of users will very happily spend the time reporting bugs and working with devs to get it resolved.

people don't like wasting their time on bug reports that go into the void. they do like contributing their time when they know it's going to get their issue fixed and make things better for everyone.

this is why I regularly tell people "FEED ME YOUR BUG REPORTS! I WANT THEM ALL!"

it's just what you have to do if you want your code to be truly bulletproof.

jcalvinowens 15 hours ago | parent [-]

> the btrfs devs are also famous for being unresponsive to these sorts

No, Kent, they are not. Posting attacks like this without evidence is cowardly and dishonest. I'm not going to tolerate these screeds from you about people I've worked with and respect without calling you out.

Every time you spew this toxicity, a chunk of bcachefs users reformat and walk away. Very soon, you'll have none left.

ziml77 2 days ago | parent | prev | next [-]

If you complain about a knife crushing your food because you're holding it upside down, it's good for everyone else to know that context. That way, anyone using it with the sharp side down can safely ignore the problem rather than being scared away by an issue they won't experience.

metadat 2 days ago | parent | prev | next [-]

I've experienced unrecoverable corruption with btrfs within the past 2 years.

motorest 2 days ago | parent | prev [-]

> Ah yes, the famous "holding it wrong".

Is it wrong to ask how to reproduce an issue?