Remix.run Logo
kbaker 4 days ago

OK, interesting, I think I see... So you are asking about if SQLite opens and finds a not-committed rollback journal that looks valid, then it rolls it back?

I was more curious so I looked at the code here:

https://sqlite.org/src/file?name=src/pager.c&ci=trunk

and found something similar to what you are asking in this comment before `sqlite3PagerCommitPhaseTwo`:

    ** When this function is called, the database file has been completely
    ** updated to reflect the changes made by the current transaction and
    ** synced to disk. The journal file still exists in the file-system
    ** though, and if a failure occurs at this point it will eventually
    ** be used as a hot-journal and the current transaction rolled back.
So, it does this:

    ** This function finalizes the journal file, either by deleting,
    ** truncating or partially zeroing it, so that it cannot be used
    ** for hot-journal rollback. Once this is done the transaction is
    ** irrevocably committed.
Assuming fsync works on both the main database and the hot journal, then I don't see a way that it is not durable? Because, it has to write and sync the full hot journal, then write to the main database, then zero out the hot journal, sync that, and only then does it atomically return from the commit? (assuming FULL and DELETE)
agwa 3 days ago | parent [-]

> OK, interesting, I think I see... So you are asking about if SQLite opens and finds a not-committed rollback journal that looks valid, then it rolls it back?

Right.

> Assuming fsync works on both the main database and the hot journal, then I don't see a way that it is not durable? Because, it has to write and sync the full hot journal, then write to the main database, then zero out the hot journal, sync that, and only then does it atomically return from the commit? (assuming FULL and DELETE)

DELETE mode doesn't truncate or zero the journal; it merely deletes it from the directory. You need to switch to TRUNCATE or PERSIST for that behavior: https://sqlite.org/pragma.html#pragma_journal_mode

I confirmed all of this by attaching gdb to SQLite and setting a breakpoint on unlink. At the time of unlink the journal is still "hot" and usable for rollback: https://news.ycombinator.com/item?id=45069533

kbaker 3 days ago | parent [-]

Yeah, I see the comments down below in pager.c which explain it a bit better. I guess I thought its behavior was more like PERSIST by default.

    **   journalMode==DELETE
    **     The journal file is closed and deleted using sqlite3OsDelete().
    **
    **     If the pager is running in exclusive mode, this method of finalizing
    **     the journal file is never used. Instead, if the journalMode is
    **     DELETE and the pager is in exclusive mode, the method described under
    **     journalMode==PERSIST is used instead.
So I guess this is one of the tradeoffs SQLite makes between extreme durability with (PERSIST, TRUNCATE, or using EXTRA) vs speed.

I know we have used SQLite quite a bit without getting into this exact scenario - or at least the transaction that would have been rolled back wasn't important enough in our case, I guess when the device powers down milliseconds later to trigger this case we never noticed (or needed this transaction), only that the DB remained consistent.

And it seems like if DELETE is the default and doesn't get noticed enough in practice that it needs to be changed to a different safer default (like to PERSIST or something,) I guess it is better to have the speed gains from reducing that extra write + fsync.

But, now I guess you know enough to submit better documentation upstream for a sliding scale of durability, I definitely agree that the docs could be better about how to get ultimate durability, and how to tune the knobs for `journal_mode` and `synchronous`.