zaarn 2 days ago

ZFS isn’t viable for SQLite unless you turn off fsyncs in ZFS, because otherwise you will have the same experience I had for years: SQLite may randomly hang for up to a few minutes with no visible cause if there aren’t enough background writes to fill up a txg. If your app depends on SQLite, it’ll randomly die.

Btrfs is a better choice for SQLite; I haven’t seen that issue there.
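
One way to check whether a given filesystem shows the hangs described above is to time small committed writes and flag the slow ones. The following is a minimal sketch, not a benchmark: the function names, the `probe` table, and the one-second threshold are all hypothetical choices for illustration, and the database path should point at the dataset under test.

```python
import os
import sqlite3
import tempfile
import time


def measure_commit_latencies(db_path, n_commits=50):
    """Return the wall-clock duration in seconds of each small committed write."""
    conn = sqlite3.connect(db_path)
    try:
        # synchronous=FULL asks SQLite to sync to disk on every commit.
        conn.execute("PRAGMA synchronous=FULL")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS probe (id INTEGER PRIMARY KEY, ts REAL)"
        )
        conn.commit()
        latencies = []
        for _ in range(n_commits):
            start = time.perf_counter()
            conn.execute("INSERT INTO probe (ts) VALUES (?)", (time.time(),))
            conn.commit()  # the sync happens here
            latencies.append(time.perf_counter() - start)
        return latencies
    finally:
        conn.close()


def stalls(latencies, threshold_s=1.0):
    """Return (index, seconds) for every commit slower than threshold_s."""
    return [(i, s) for i, s in enumerate(latencies) if s > threshold_s]


if __name__ == "__main__":
    # Hypothetical location: replace with a path on the filesystem being tested.
    path = os.path.join(tempfile.mkdtemp(), "probe.db")
    print(stalls(measure_commit_latencies(path)))
```

Since the reported hangs are intermittent, a short run that shows no stalls does not rule the problem out; the loop would need to run for a long time under realistic load.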

Modified3019 2 days ago | parent | next [-]

Interesting. Found a GitHub issue that covers this bug: https://github.com/openzfs/zfs/issues/14290

The latest comment seems to be a nice summary of the root cause, while earlier comments in the thread point to ftruncate, rather than fsync, as the trigger:

>amotin

>I see. So ZFS tries to drop some data from pagecache, but there seems to be some dirty pages, which are held by ZFS till they are either written into ZIL, or to disk at the end of TXG. And if those dirty page writes were asynchronous, it seems there is nothing that would nudge ZFS to actually do something about it earlier than zfs_txg_timeout. Somewhat similar problem was recently spotted on FreeBSD after #17445, which is why newer version of the code in #17533 does not keep references on asynchronously written pages.

Might be worth testing zfs_txg_timeout=1 or 0
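
Before experimenting with zfs_txg_timeout, it helps to know the current value. A minimal sketch follows, assuming OpenZFS on Linux, where module parameters are normally exposed as files under /sys/module/zfs/parameters; the helper names are hypothetical, and other platforms expose this tunable differently.

```python
from pathlib import Path

# Assumption: OpenZFS on Linux exposes its module parameters in this directory.
ZFS_PARAMS_DIR = Path("/sys/module/zfs/parameters")


def parse_param_text(text):
    """Parse the text of a parameter file into an int, or None if not numeric."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_zfs_param(name, params_dir=ZFS_PARAMS_DIR):
    """Return the integer value of a ZFS module parameter, or None if unavailable."""
    try:
        return parse_param_text((Path(params_dir) / name).read_text())
    except OSError:
        # The module is not loaded, the parameter is missing, or it is unreadable.
        return None


if __name__ == "__main__":
    print("zfs_txg_timeout =", read_zfs_param("zfs_txg_timeout"))
```

Changing the value is a separate, privileged step and is not shown here.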

jclulow a day ago | parent | prev | next [-]

This isn't an inherent property of ZFS at all. I have made heavy use of SQLite for years (on illumos systems) without ever hitting this, and I would never counsel anybody to disable sync writes: it absolutely can lead to data loss under some conditions and is not safe to do unless you understand what it means.

What you're describing sounds like a bug specific to whichever OS you're using that has a port of ZFS.

zaarn a day ago | parent [-]

I wouldn't recommend SQLite on ZFS (or in general for other reasons), for the precise reason that it either lags or is unsafe.

I've encountered this bug both on illumos, specifically OpenIndiana, and Linux (Arch Linux).

throw0101b 2 days ago | parent | prev [-]

> ZFS isn’t viable for SQLite unless you turn off fsync’s in ZFS

Which you can do on a per dataset ('directory') basis very easily:

    zfs set sync=disabled mydata/mydb001
* https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops...

Meanwhile all the rest of your pools / datasets can keep the default POSIX behaviour.
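
To confirm which setting a dataset actually has, the property can be read back with `zfs get`. This is a minimal sketch that wraps the command; the function names are hypothetical, the dataset name is the one from the command above, and the `run` parameter exists only so the call can be swapped out where no ZFS is installed.

```python
import subprocess


def sync_get_command(dataset):
    """Build the argv that reads the `sync` property of one dataset."""
    # -H drops the header line; -o value prints only the property value.
    return ["zfs", "get", "-H", "-o", "value", "sync", dataset]


def read_sync_property(dataset, run=subprocess.run):
    """Return the dataset's sync setting: 'standard', 'always' or 'disabled'."""
    result = run(
        sync_get_command(dataset), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Requires the zfs CLI and an existing dataset with this name.
    print(read_sync_property("mydata/mydb001"))
```

A check like this could run as part of deployment, so the setting is verified rather than remembered.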

ezekiel68 2 days ago | parent | next [-]

You know what's even easier than doing that? Neglecting to do it or meaning to do it then getting pulled in to some meeting (or other important distraction) and then imagining you did it.

throw0101b 2 days ago | parent [-]

> Neglecting to do it or meaning to do it then getting pulled in to some meeting (or other important distraction) and then imagining you did it.

If your job is to make sure your file system and your database (SQLite, Pg, My/MariaDB, etc.) are tuned together, and you don't tune it, then you should be called into a meeting. Or at least the no-fault RCA should bring up remediation methods to make sure it's part of the SOP so that it won't happen again.

The alternative the GP suggests is using Btrfs, which I find even more irresponsible than your non-tuning situation. (Heck, if someone on my sysadmin team suggested we start using Btrfs for anything I would think they were going senile.)

johncolanduoni 2 days ago | parent [-]

Facebook is apparently using it at scale, which surprised me. Though that’s not necessarily an endorsement, and who knows what their kernel patchset looks like.

zaarn a day ago | parent | prev | next [-]

Disabling sync corrupts SQLite databases on power loss. I've personally experienced this after disabling sync, which I did because leaving it on causes SQLite to hang.

You cannot have SQLite keep your data and run well on ZFS unless you make a zvol and format it as btrfs or ext4 so they solve the problem for you.

kentonv 2 days ago | parent | prev [-]

Doesn't turning off sync mean you can lose confirmed writes in a power failure?