|
| ▲ | nemothekid 2 days ago | parent | next [-] |
| >I corrupt an old frame that is not needed by the database at all 1. How does the database know that. 2. In your example Alice gets the money from nowhere. What if another user had sent the money to Alice and that frame get corrupted. Then you just created 10,000,000 from nowhere. At the very least, rolling back to a good point gives you an exact moment of time where transactions can be applied from. Your example is very contrived and in a database where several transactions can be happening, doing a partial recovery will destroy the consistency of the database. |
|
| ▲ | NortySpock 2 days ago | parent | prev | next [-] |
| The demo is good, through I wish you had presented it as text rather than as a gif. I do see your point of wanting an option to refuse to delete the wal so a developer can investigate the wal and manually recover... But the the typical user probably wants the database to come back up with a consistent, valid state if power is lost. They do not want have the database refuse to operate because it found uncommitted transactions in a scratchpad file... As a SQL-first developer, I don't pick apart write-ahead logs trying to save a few bytes from the great hard drive in the sky, I just want the database to give me the current state of my data and never be in an invalid state. |
| |
| ▲ | avinassh 2 days ago | parent [-] | | > As a SQL-first developer, I don't pick apart write-ahead logs trying to save a few bytes from the great hard drive in the sky, I just want the database to give me the current state of my data and never be in an invalid state. Yes, that is a very valid choice. Hence, I want databases to give me an option, so that you can choose to ignore the checksum errors and I can choose to stop the app and try to recover. | | |
| ▲ | lxgr 2 days ago | parent [-] | | WAL checksum errors are not used to catch data corruption, they are only used to detect incomplete writes or leftover data from previous checkpoints. If you attempt to do the former in a system that by design uses checksums only for the latter, you'll actually introduce corrupted data from some non-corrupted WAL files. |
|
|
|
| ▲ | lxgr 2 days ago | parent | prev | next [-] |
| What about all classes of durability errors that do not manifest in corrupted checksums (e.g. "cleanly" lost appends of entire transactions to the WAL)? It seems like you're focusing on a very specific failure mode here. Also, what if the data corruption error happens during the write to the actual database file (i.e. at WAL checkpointing time)? That's still 50% of all your writes, and there's no checksum there! |
|
| ▲ | ncruces 2 days ago | parent | prev [-] |
| > Since WAL corruption issues are silent, we cannot do that now. You do have backups, right? |
| |
| ▲ | avinassh 2 days ago | parent [-] | | Yes. Say if I am using something like litestream, then all subsequent generations will be affected. At some point, I'd go back and find out the broken generation. Instead of that, I'd prefer for it to fail fast |
|