TheDong 2 days ago

> ZFS is usually not recommended for databases

Say more? I've heard people say that ZFS is somewhat slower than, say, ext4, but I've personally had zero issues running postgres on zfs, nor have I heard any well-reasoned reasons not to.

> What filesystems in the wild typically provide for this is weaker than what is advisable for a database, so databases should bring their own implementation.

Sorry, what? Just yesterday matrix.org had a post about how they (using ext4 + postgres) had disk corruption which led to postgres returning garbage data: https://matrix.org/blog/2025/07/postgres-corruption-postmort...

The corruption was likely present for months or years, and postgres didn't notice.

ZFS, on the other hand, would have noticed during a weekly scrub and complained loudly, letting you know a disk had an error, letting you attempt to repair it if you used RAID, etc.

Stuff like what's in that post is exactly why I run postgres on ZFS.

If you've got specifics about what you mean by "databases should bring their own implementation", I'd be happy to hear it, but I'm having trouble thinking of any sorta technically sound reason for "databases actually prefer it if filesystems can silently corrupt data lol" being true.
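
For anyone unfamiliar with what a scrub does, here is a minimal sketch of the idea: store a checksum with every block at write time, then periodically re-read everything and flag mismatches. The in-memory `store` dict and the `write_block`/`scrub` helpers are hypothetical, and SHA-256 is only a stand-in checksum; this is not a description of ZFS's on-disk format.

```python
import hashlib

def write_block(store, key, data):
    """Store data alongside a SHA-256 checksum computed at write time."""
    store[key] = (bytearray(data), hashlib.sha256(data).digest())

def scrub(store):
    """Re-read every block and return the keys whose data no longer matches its checksum."""
    return [key for key, (data, digest) in store.items()
            if hashlib.sha256(data).digest() != digest]

store = {}  # hypothetical stand-in for a disk
write_block(store, "page-0", b"hello" * 100)
write_block(store, "page-1", b"world" * 100)

# Simulate silent on-disk corruption: flip one bit in page-1.
store["page-1"][0][10] ^= 0x01

damaged = scrub(store)
print(damaged)
```

The point is that detection happens on a schedule, not only when the application happens to read the damaged page.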

zaarn 2 days ago | parent | next [-]

SQLite on ZFS needs the Fsync behaviour to be off, otherwise SQLite will randomly hang the application as the fsync will wait for the txg to commit. This can take a minute or two, in my experience.

Btrfs is a better choice for SQLite.
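
A rough way to check whether fsync stalls on a given filesystem is to time it directly. This is only a sketch: the `timed_fsync_write` helper, the 4 KiB payload, and the 20 iterations are arbitrary choices of mine, and it writes to the default temp directory unless pointed elsewhere.

```python
import os
import tempfile
import time

def timed_fsync_write(path, payload):
    """Append payload to path, fsync it, and return the fsync latency in seconds."""
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()                      # push Python's buffer to the OS
        start = time.perf_counter()
        os.fsync(f.fileno())           # block until the OS reports the write as flushed
        return time.perf_counter() - start

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "probe.bin")
    latencies = [timed_fsync_write(path, b"x" * 4096) for _ in range(20)]

print(f"max fsync latency: {max(latencies) * 1000:.2f} ms")
```

Passing `dir=` to `tempfile.TemporaryDirectory` with a path on the dataset in question should make the numbers reflect that filesystem rather than wherever the system temp directory lives.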

supriyo-biswas 2 days ago | parent | next [-]

Btw this concern also applies to other databases, although probably it manifests in the worst way in SQLite. Essentially, you’re doing a WAL over the file system’s own WAL-like recovery mechanism.

zaarn a day ago | parent [-]

I've not observed other databases locking up on ZFS; Postgres and MySQL both function just fine without needing any settings modified.

throw0101b 2 days ago | parent | prev [-]

> SQLite on ZFS needs the Fsync behaviour to be off […]

    zfs set sync=disabled mydata/mydb001

* https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops...

zaarn a day ago | parent [-]

As noted in a sibling comment, this causes corruption on power failure.

jandrewrogers 2 days ago | parent | prev [-]

The point is that a database cannot rely on being deployed on a filesystem with proper checksums.

Ext4 uses 16-/32-bit CRCs, which is very weak for storage integrity in 2025. Many popular filesystems for databases are similarly weak. Even if they have a strong option, the strong option is not enabled by default. In real-world Linux environments, the assumption that the filesystem has weak checksums is usually true.

Postgres has (IIRC) 32-bit CRCs but they are not enabled by default. That is also much weaker than you would expect from a modern database. Neither open source databases nor the filesystems they often run on have a good track record of providing robust corruption detection. It is a systemic problem.

ZFS doesn't support features that high-performance database kernels use and is slow, particularly on high-performance storage. Postgres does not use any of those features, so it matters less if that is your database. XFS has traditionally been the preferred filesystem for databases on Linux and Ext4 will work. Increasingly, databases don't use external filesystems at all.

mardifoufs 2 days ago | parent [-]

I know MySQL has checksums by default, how does it compare? Is it useful or is it similarly weak?

jandrewrogers 2 days ago | parent [-]

I don't know, but LLMs seem to think it uses a 32-bit CRC, like Postgres.

In fairness, 32-bit CRCs were the standard 20+ years ago. That is why all the old software uses them and CPUs have hardware support for computing them. It is a legacy thing that just isn't a great choice in 2025.
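
To put rough numbers on it, here is a back-of-the-envelope sketch. It assumes corruption is random enough that an n-bit checksum behaves like a uniform hash, so a corrupt block slips through with probability 2^-n. That is a simplification (CRCs do guarantee detection of some small error patterns), and the count of one billion corrupt blocks is a made-up figure for illustration only.

```python
def undetected_probability(checksum_bits):
    """Chance that a randomly corrupted block still matches its checksum."""
    return 2.0 ** -checksum_bits

def expected_undetected(corrupt_blocks, checksum_bits):
    """Expected number of corrupt blocks that slip past the checksum."""
    return corrupt_blocks * undetected_probability(checksum_bits)

# Hypothetical fleet-wide count of corrupted blocks over some period.
corrupt_blocks = 10**9

for bits in (16, 32, 64, 256):
    print(bits, expected_undetected(corrupt_blocks, bits))
```

Under this model the miss rate falls by a factor of two for every extra checksum bit, which is why the width matters far more at fleet scale than on a single machine.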
