jrockway 2 days ago

I think it's kind of possible to partially apply the WAL manually. Imagine your frames are:

1) Insert new subscription for "foobar @ 123 Fake St."
2) Insert new subscription for "�#�#�xD�{.��t��3Axu:!"
3) Insert new subscription for "barbaz @ 742 Evergreen Terrace"

A human could probably grab two subscriptions out of that data loss incident. I think that's what they're saying. If you're very lucky and want to do a lot of manual work, you could maybe restore some of the data. Obviously both of the "obviously correct" records could just be random bitflips that happen to look right to humans. There's no way of knowing.

cwillu 2 days ago | parent | next [-]

And if the “obviously correct” last entry is actually an old entry that just hadn't been overwritten yet? Or if it was only permitted because of something in the corrupted section?

The database should absolutely not be performing guesswork about the meaning of its contents during recovery. If you want mongodb, go use mongodb.

jrockway 2 days ago | parent [-]

I'm aware that nothing should automatically attempt this recovery, but asking the database to throw an error when it happens does not seem crazy to me. I mean, I get why it doesn't. I have worked on systems with "mount /dev/sda1 || mkfs.ext4 /dev/sda1" because there isn't a UI to recover the filesystem, so nothing is lost by erasing it. But I don't think it makes you bad at databases to want this condition to be optionally fatal. (I can also see why "welp it's corrupted <unlink>" is also annoying. If I'm the author of this software, I would be interested in checking out the corrupted file before deleting it. I can live with the slight hit to my free disk space if it means I can fix a bug!)

ulrikrasmussen 2 days ago | parent | prev | next [-]

Yes, in this particular example you could. But in general the database cannot make assumptions that changes are independent of each other.

I think SQLite assumes that a failing checksum occurs due to a crash during a write which never finished. A corrupt WAL frame before a valid frame can only occur if the underlying storage is corrupt, but it makes no sense for SQLite to start handling that during replay as it has no way to recover. You could maybe argue that it should emit a warning.
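
To make the "stop at the first failing checksum" behaviour concrete, here is a toy sketch in Python. It is not SQLite's WAL format or checksum algorithm: the frame layout, `make_frame`, `replay`, and the use of `zlib.crc32` are all assumptions for illustration. It replays frames in order, stops at the first one that fails its checksum, and reports whether any valid-looking frames sit after the stop point, which is the condition someone might want a warning for.

```python
import zlib

def make_frame(payload: bytes) -> tuple[bytes, int]:
    # Hypothetical frame: the payload plus a CRC32 of that payload.
    return payload, zlib.crc32(payload)

def is_valid(frame: tuple[bytes, int]) -> bool:
    payload, checksum = frame
    return zlib.crc32(payload) == checksum

def replay(frames):
    """Apply frames in order and stop at the first bad checksum.

    Returns the applied payloads and a flag saying whether any frame
    after the stop point still looks valid (a hint of storage corruption
    rather than an unfinished write).
    """
    applied = []
    for index, frame in enumerate(frames):
        if not is_valid(frame):
            valid_after = any(is_valid(f) for f in frames[index + 1:])
            return applied, valid_after
        applied.append(frame[0])
    return applied, False

# Build three frames, then corrupt one byte of the middle payload
# while keeping its original checksum.
good_first = make_frame(b"foobar @ 123 Fake St.")
payload, checksum = make_frame(b"middle subscription")
corrupted = (bytes([payload[0] ^ 0xFF]) + payload[1:], checksum)
good_last = make_frame(b"barbaz @ 742 Evergreen Terrace")

applied, valid_after = replay([good_first, corrupted, good_last])
```

Here `applied` holds only the first record, and `valid_after` is true because the third frame still checks out. The sketch deliberately does not apply that third frame: as noted above, nothing guarantees it is independent of the corrupted one.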

supriyo-biswas 2 days ago | parent | prev [-]

This could work for a simple key-value store, but SQLite also does referential integrity, which means we might just end up with extra rows that reference entities missing from the other table. IMO, best avoided in a transactional database.
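
A small illustration of the orphan-row problem, using Python's built-in `sqlite3` module. The `customer` and `subscription` tables are made up for this example, and turning foreign key enforcement off is only a stand-in for a recovery path that writes data without going through constraint checks; it is not how WAL replay works internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for a recovery path that bypasses constraint enforcement.
conn.execute("PRAGMA foreign_keys = OFF")

conn.executescript("""
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE subscription (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id)
);
""")

# The "recovered" subscription survives, but the customer row it points
# at was in the part of the log that was lost.
conn.execute("INSERT INTO subscription (id, customer_id) VALUES (1, 42)")
conn.commit()

# PRAGMA foreign_key_check lists rows whose parent row is missing.
# Each result row is (child table, rowid, parent table, fk index).
violations = conn.execute("PRAGMA foreign_key_check").fetchall()
```

`violations` contains one entry for the `subscription` row, since customer 42 does not exist. A key-value store has no equivalent check to fail, which is why salvaging individual records is less risky there.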