klodolph 3 days ago

I don’t have any respect for the viewpoint that “durable” is equivalent to “stored on disk”, and I don’t want to spend time accommodating that viewpoint. It is a badly misleading oversimplification.

AFRs (annualized failure rates) and a discussion of different failure scenarios are the bare minimum, and the minimum set of scenarios is disk loss, total machine loss, and data center loss. This is just my take on things. I don’t care if something is on disk or not. I do care what happens when a sector on disk goes bad, when a faulty power supply destroys all the disks in a machine, or when a data center floods.

That forces you to think about things like whether you want to turn on synchronous replication.
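
To make the per-scenario accounting above concrete, here is a rough sketch of how the numbers might be combined. Everything in it is an assumption for illustration: the `Scenario` type, the `annual_loss_probability` helper, the probabilities, and above all the premise that the scenarios are independent, which real correlated failures often violate.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Scenario:
    """One failure scenario (hypothetical type, for illustration only)."""
    name: str
    annual_probability: float  # made-up figure, not a measured AFR
    survived: bool             # does the deployment keep the data if this happens?


def annual_loss_probability(scenarios: Iterable[Scenario]) -> float:
    """Chance of at least one data-losing event in a year.

    Assumes the scenarios are independent of each other, which is a
    simplification.
    """
    p_no_loss = 1.0
    for s in scenarios:
        if not s.survived:
            p_no_loss *= 1.0 - s.annual_probability
    return 1.0 - p_no_loss


# Hypothetical deployment: mirrored disks in one machine, no off-site replica.
deployment = [
    Scenario("single disk loss", 0.02, survived=True),
    Scenario("total machine loss", 0.01, survived=False),
    Scenario("data center loss", 0.001, survived=False),
]

print(f"{annual_loss_probability(deployment):.5f}")
```

Flipping `survived` to `True` for the machine or data center scenario is the model's stand-in for adding replication across machines or sites.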

jakewins 2 days ago | parent [-]

The point of “durable” implying stored to durable media is precisely that it allows the operator of the system to make that kind of calculation. They know the disks they picked and the replication chosen, and as long as the database calls fsync, their calculations will work.
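
For reference, a minimal sketch of what "the database calls fsync" amounts to at the system-call level, assuming a POSIX system. The `durable_write` helper is hypothetical and not taken from any particular database; it also assumes the drive honors flush requests.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data and return only after fsync has reported success.

    Hypothetical helper for illustration; real databases append to a
    write-ahead log and batch fsync calls rather than rewriting files.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to flush the file's data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the parent directory so the new directory entry persists.
    # Opening a directory this way works on Linux and macOS, not on Windows.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A system that acknowledges a write before the `os.fsync` step is making a weaker promise, because an acknowledged write may then exist only in the page cache.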

My beef is with database systems that use the argument you made further up thread to skip fsync to juice their performance numbers. Data is not “durable” if turning off the machines storing it means it’s lost. That’s a category difference, not a pure probability difference as you are claiming.

It is of course totally fine to not store data to durable media and say the risk of devops doing a coordinated reboot is as low as the risk of RAID disk data loss, but then don’t use the word “durable”.

klodolph 7 hours ago | parent [-]

That definition of durable doesn’t seem useful to me, sorry. I want the failure rates and scenarios.