koverstreet 4 days ago
that would be bcachefs :)

It's an entirely clean slate design, and I spent years taking my time on the core planning out the design; it's as close to perfect as I can make it.

The only things I can think of that I would change or add given unlimited time and budget:

- It should be written in Rust, and even better, Rust + dependent types (which I suspect could be done with proc macros) for formal verification. And cap'n proto for on disk data structures (which still needs Rust improvements to be as ergonomic as it should be) would also be a really nice improvement.

- More hardening; the only other thing we're lacking is comprehensive fault injection testing of on disk errors. It's sufficiently battle hardened that it's not a major gap, but it really should happen at some point.

- There's more work to be done in bitrot prevention: data checksums really need to be plumbed all the way into the pagecache.

I'm sure we'll keep discovering new small ways to harden, but nothing huge at this point.

Some highlights:

- It has more defense in depth than any filesystem I know of. It's as close to impossible to have unrecoverable data loss as I think can really be done in a practical production filesystem - short of going full immutable/append only.

- Closest realization of "filesystem as a database" that I know of

- IO path options (replication level, compression, etc.) can be set on a per file or directory basis: I'm midway through a project extending this to do some really cool stuff, basically data management is purely declarative.

- Erasure coding is much more performant than ZFS's

- Data layout is fully dynamic, meaning you can add/remove devices at will, it just does the right thing - meaning smoother device management than ZFS

- The way the repair code works, and tracking of errors we've seen - fantastic for debuggability

- Debuggability and introspection are second to none: long bug hunts really aren't a thing in bcachefs development because you can just see anything the system is doing

There's still lots of work to do before we're fully at parity with ZFS. Over the next year or two I should be finishing erasure coding, online fsck, failure domains, lots more management stuff... there will always be more cool projects just over the horizon
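A rough illustration of what "IO path options set on a per file or directory basis" can look like as a declarative model. This is a toy sketch only, not bcachefs's implementation or interface: the option names, the `DEFAULTS` table, the `overrides` mapping and `effective_options` are all hypothetical, and the assumption is simply that the setting on the nearest ancestor wins.

```python
from pathlib import PurePosixPath

# Hypothetical filesystem-wide defaults (assumed names, for illustration only).
DEFAULTS = {"replicas": 1, "compression": "none"}

def effective_options(path, overrides):
    """Resolve options for `path`, letting deeper directories override shallower ones."""
    opts = dict(DEFAULTS)
    p = PurePosixPath(path)
    # Walk from the root down to the path itself so that later (deeper) updates win.
    for ancestor in reversed([p, *p.parents]):
        opts.update(overrides.get(str(ancestor), {}))
    return opts

# Declarative policy: say what you want per directory, nothing about how to get there.
overrides = {
    "/archive": {"compression": "zstd"},
    "/archive/critical": {"replicas": 3},
}

print(effective_options("/archive/critical/db.img", overrides))
print(effective_options("/home/user/notes.txt", overrides))
```

The first path picks up compression from `/archive` and the replica count from `/archive/critical`; the second falls back to the defaults.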
lifty 3 days ago
Thanks for bcachefs and all the hard work you’ve put into it. It’s truly appreciated, and I hope you can continue to march on and not give up on the in-kernel code, even if it means bowing to Linus.

On a different note, have you heard about prolly trees and structural sharing? It’s a newer data structure that allows for very cheap structural sharing, and I was wondering if it would be possible to build an FS on top of it to have a truly distributed fs that can sync across machines.
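For anyone unfamiliar with the idea, here is a minimal sketch of the property that makes prolly trees share structure cheaply: node boundaries are chosen from a hash of the content, so the same set of keys always produces the same chunks regardless of insertion order. It covers only a single leaf level, and `is_boundary`, `chunk`, `chunk_ids` and the `avg_chunk` parameter are hypothetical names for illustration, not taken from any particular implementation.

```python
import hashlib

def is_boundary(key: str, avg_chunk: int = 4) -> bool:
    """A key ends a chunk when its hash hits a fixed pattern (about 1 in avg_chunk keys)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % avg_chunk == 0

def chunk(keys):
    """Split the sorted keys into chunks at content-defined boundaries."""
    chunks, current = [], []
    for key in sorted(keys):
        current.append(key)
        if is_boundary(key):
            chunks.append(tuple(current))
            current = []
    if current:
        chunks.append(tuple(current))
    return chunks

def chunk_ids(keys):
    """Content address of every chunk; equal chunks get equal ids on any machine."""
    return {hashlib.sha256("\0".join(c).encode()).hexdigest() for c in chunk(keys)}

base = [f"file{i:03d}" for i in range(200)]
edited = base + ["file100a"]  # one new key somewhere in the middle

old_ids, new_ids = chunk_ids(base), chunk_ids(edited)
print(f"chunks before: {len(old_ids)}, shared after edit: {len(old_ids & new_ids)}")
```

Because a single insert can only change the chunk it lands in (or split it in two), two machines syncing would need to exchange only the chunks whose ids differ. A real prolly tree applies the same rule recursively to build the internal levels.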
nullc 3 days ago
> - Erasure coding is much more performant than ZFS's

Any plans for much lower rates than typical RAID? Increasingly, modern high density devices are having block level failures at non-trivial rates instead of, or in addition to, whole device failures.

A file might be 100,000 blocks long; adding 1000 blocks of FEC would expand it 1% but add tremendous protection against block errors. And it can do so even if you have a single piece of media. That doesn't protect against device failures, sure, though without good block level protection, device level protection is dicey: hitting some block level error when down to minimal devices seems inevitable, and having to add more and more redundant devices is quite costly.
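The arithmetic above can be sketched in a few lines. This assumes an ideal MDS-style code (Reed-Solomon-like), where each parity block can repair one lost block whose location is known, for example because a checksum flagged it. The function names are hypothetical and no particular filesystem's FEC scheme is implied.

```python
def fec_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Fractional space expansion from adding parity blocks to a file."""
    return parity_blocks / data_blocks

def recoverable(parity_blocks: int, lost_blocks: int) -> bool:
    """Assumption: ideal MDS code, so any loss up to the parity count is repairable."""
    return lost_blocks <= parity_blocks

# The example from the comment: 100,000 data blocks plus 1000 FEC blocks.
print(f"{fec_overhead(100_000, 1_000):.1%}")  # 1.0%

# For comparison, a 4+1 RAID5-style stripe and a two-way mirror.
print(f"{fec_overhead(4, 1):.1%}")
print(f"{fec_overhead(1, 1):.1%}")

# Under the stated assumption, up to 1000 bad blocks anywhere in the file are repairable.
print(recoverable(1_000, 1_000), recoverable(1_000, 1_001))
```

The point of the comparison is that a long, low-rate code spends far less space than stripe-level redundancy while still covering scattered block errors, though it does nothing for whole-device loss.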
Icathian 3 days ago
I happen to work at a company that uses a ton of capnp internally and this is the first time I've seen it mentioned much outside of here. Would you mind describing what about it you think would make it a good fit for something like bcachefs?
ZenoArrow 3 days ago
> Closest realization of "filesystem as a database" that I know of

More so than BFS?

https://en.m.wikipedia.org/wiki/Be_File_System

"Like its predecessor, OFS (Old Be File System, written by Benoit Schillings - formerly BFS), it includes support for extended file attributes (metadata), with indexing and querying characteristics to provide functionality similar to that of a relational database."
m-p-3 3 days ago
I'm saddened by this turn of events, but I hope this won't deter you from working on bcachefs on your own terms and eventually seeing a reconciliation with the kernel at some point. Thank you for your hard work.