Remix.run Logo
cogman10 7 months ago

Rather than batching, I would want a "NO ROLLBACK DELETE" sort of command. The real expensive part of the delete is rewriting the records into the transaction log so that a cancel or crash can undo the delete.

If you've gone to the effort of batching things, you are still writing out those records, you are just giving the db a chance to delete them from the log.

I'd like to save my ssds that heartache and instead allow the database to just delete.

In MSSQL in some extreme circumstances, we've partitioned our tables specifically so we can use the 'TRUNCATE TABLE' command as delete is just too expensive.

That operation can wipe gbs in seconds.

mike_hearn 7 months ago | parent | next [-]

Yes the commercial databases make it easier to handle this.

One simple way in Oracle is to take a table lock, copy the data you want to preserve out to a temporary table, truncate the target table, copy the data back in.

Kinrany 7 months ago | parent | prev | next [-]

What happens when two transactions select and then delete two rows in the opposite order while requesting "no rollback"?

pixl97 7 months ago | parent | prev [-]

I would say it can unlink gbs in seconds. The data is still on the disk until it's trimed or overwritten.

immibis 7 months ago | parent [-]

So why does it need to be copied into the WAL log until vacuum runs?

And vacuum is not expected or required to be atomic, since it deletes data that was necessarily unreferenced anyway, so it also shouldn't need to copy the old data into WAL files.

mattashii 7 months ago | parent [-]

Many DBMSs with index-oriented storage (MySQL, Oracle, MSSQL) use undo logging for a transaction's MVCC, so that for deletion the old version is put into the undo log of that transaction and referred to as an old version of the record (or page, or ...), immediately cleaning up space on the page for new data while the transaction is still goin on. This is great for short transactions and record updates, as a page only has to hold one tuple version at a time, but that is at the cost of having to write the tuples that are being removed into a log, just in case the transaction needs to roll back.

immibis 7 months ago | parent [-]

The space isn't immediately cleaned up because of Postgres's version-based MVCC. It should only need to record that it marked the row as deleted, and the vacuum shouldn't need to record anything because it isn't atomic.

mattashii 7 months ago | parent [-]

Yes, but that's in PostgreSQL, not in MSSQL or the other systems I described (and which the gps seemed to refer to)