burntcaramel 7 months ago

If data isn’t actually removed until vacuuming, then are systems that perform SQL DELETEs actually GDPR compliant? Technically, the private data is still there on disk and could be recovered: “Until the autovacuum process or a manual VACUUM operation reclaims the space, the ‘deleted’ data remains.”
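To make the concern concrete, here is a minimal toy sketch in Python. It is not PostgreSQL's real storage format: `ToyHeap`, its slot states, and the sample email addresses are all hypothetical names chosen for illustration. It assumes only what is described above: a delete flags a row as dead, and a later vacuum makes the space reusable. Whether the old bytes then get overwritten depends on later writes.

```python
from dataclasses import dataclass, field

LIVE, DEAD, FREE = "live", "dead", "free"


@dataclass
class ToyHeap:
    """Toy table storage; NOT PostgreSQL's real page format."""
    slots: list = field(default_factory=list)  # each slot is [state, payload]

    def insert(self, payload: bytes) -> int:
        # Reuse a slot that vacuum has freed, overwriting its old bytes.
        for i, slot in enumerate(self.slots):
            if slot[0] == FREE:
                self.slots[i] = [LIVE, payload]
                return i
        self.slots.append([LIVE, payload])
        return len(self.slots) - 1

    def delete(self, slot_id: int) -> None:
        # DELETE only flags the row version as dead; the bytes stay put.
        self.slots[slot_id][0] = DEAD

    def vacuum(self) -> None:
        # Vacuum makes dead slots reusable. This toy does not wipe them.
        for slot in self.slots:
            if slot[0] == DEAD:
                slot[0] = FREE

    def visible_rows(self) -> list:
        # What an ordinary query returns.
        return [p for state, p in self.slots if state == LIVE]

    def raw_bytes(self) -> bytes:
        # What reading the underlying storage directly would expose.
        return b"".join(p for _, p in self.slots)


heap = ToyHeap()
row = heap.insert(b"alice@example.com")
heap.delete(row)
after_delete = (heap.visible_rows(), b"alice" in heap.raw_bytes())
heap.vacuum()
after_vacuum = b"alice" in heap.raw_bytes()
heap.insert(b"bob@example.com")
after_reuse = b"alice" in heap.raw_bytes()
```

In this toy, the deleted row disappears from queries immediately, but its bytes are only gone once a later insert reuses the freed slot.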

dataflow 7 months ago

Even vacuuming wouldn't actually destroy the data, right? Because filesystems don't guarantee they will overwrite or wipe any particular disk blocks. And even if they did, SSDs still wouldn't promise that the blocks aren't remapped instead of being wiped & reused.

Polizeiposaune 7 months ago

> Because filesystems don't guarantee they will overwrite or wipe any particular disk blocks.

Some filesystems have a richer interface to the underlying storage device, allowing them to invoke commands such as ATA TRIM or SCSI UNMAP - either incrementally as blocks are freed, or on demand - which request that the underlying storage device forget the block contents.

So the necessary interfaces exist and are widely available, and even if imperfect they improve the situation.

dataflow 7 months ago

> Some filesystems have a richer interface to the underlying storage device, allowing them to invoke commands such as ATA TRIM or SCSI UNMAP

No, that's not a guarantee of data erasure. Not only because it's merely a request that the device can disregard, but also because filesystems play tricks (like storing small bits of data inline, or logging data in various places, etc.) and they don't clear all those blocks just because you wanted to clear a couple of bytes.

__turbobrew__ 7 months ago

Yea, the only way to be sure that data is gone is through mechanical destruction (shredding) of the drives. Sometimes you can write something to an SSD and then not be able to delete it due to a hardware fault, even though the data can still be read.

I wonder if a GDPR nation has made a ruling on the extent of data erasure? Surely you cannot expect a company to shred an SSD every time someone asks for their data to be deleted.

With our current understanding of physics, you cannot destroy information outside of maybe throwing something into a black hole (and even then you may still be able to get the information back from Hawking radiation after many eons), so the question is: how much should we scramble information before it is considered “deleted”?

dataflow 7 months ago

> I wonder if a GDPR nation has made a ruling on the extent of data erasure?

My understanding (based on a couple of random conversations, so take it with a grain of salt) is that at least some entities are taking the sheer difficulty of true compliance with the letter of the law to imply that softer deletion methods have to be reasonably acceptable, and their stance is basically "if you disagree, well, take us to court and we'll figure it out."

tzs 7 months ago

It's also likely still somewhere on backups.

GDPR supervisory authorities disagree on what to do about data in backups. France has said you don't have to delete data from backups. The Danish authorities have said that you have to delete from backups where it is technically possible. The UK (which still has GDPR) has said that you must put the data "beyond use", which most have taken to mean that if you ever restore data from the backup, you must omit the data that is supposed to be forgotten.
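The "beyond use" reading could be sketched roughly as follows, in Python. Everything here is an assumption for illustration: the function name `restore_omitting_erased`, the `user_id` field, and the idea of an erasure log that holds only identifiers and is stored outside the backup itself. It shows one possible shape of the idea, not what any authority prescribes.

```python
from typing import Iterable, Iterator


def restore_omitting_erased(
    backup_rows: Iterable[dict],
    erased_ids: set,
    id_field: str = "user_id",  # hypothetical column name
) -> Iterator[dict]:
    """Yield backup rows, skipping any whose subject has asked to be erased."""
    for row in backup_rows:
        if row.get(id_field) in erased_ids:
            continue  # never re-introduce forgotten data on restore
        yield row


# Hypothetical backup contents.
backup = [
    {"user_id": 1, "email": "a@example.com"},
    {"user_id": 2, "email": "b@example.com"},
    {"user_id": 3, "email": "c@example.com"},
]
# IDs erased after the backup was taken; kept outside the backup.
erasure_log = {2}

restored = list(restore_omitting_erased(backup, erasure_log))
```

The design choice is that the backup is left untouched and the filter runs at restore time, so the erasure log has to outlive every backup it applies to.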

I don't know what other supervisory authorities have said; those three are just the ones that tend to show up when searching on this topic. I would not be surprised if there are at least a dozen other different opinions from the rest.

konha 7 months ago

Yes. GDPR allows for delays when complying with deletion requests. You should ideally document the delay and factor it into any deadlines you might be bound by.

You’d need to make sure the process is somewhat predictable, like running the vacuum on a set schedule so you know for sure the maximum amount of time a deletion request can take.
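As a rough sketch of that bookkeeping, in Python: the function names, the example durations, and the 30-day deadline are all placeholders, not figures from the thread. It also assumes every scheduled run really does reclaim the dead rows, which, for example, a long-running transaction can prevent.

```python
from datetime import timedelta


def worst_case_erasure_delay(
    vacuum_interval: timedelta,
    vacuum_duration: timedelta,
    request_handling: timedelta = timedelta(0),
) -> timedelta:
    """Upper bound on the time from receiving a request to the vacuum run finishing."""
    # Worst case: the DELETE lands just after a run has started, so the row
    # waits one full interval for the next run, which then has to complete.
    return request_handling + vacuum_interval + vacuum_duration


def fits_deadline(delay: timedelta, deadline: timedelta) -> bool:
    # True when the worst-case delay stays within the deadline you are bound by.
    return delay <= deadline


delay = worst_case_erasure_delay(
    vacuum_interval=timedelta(days=1),    # vacuum scheduled daily (assumed)
    vacuum_duration=timedelta(hours=2),   # assumed run time
    request_handling=timedelta(days=3),   # assumed time to process the request
)
ok = fits_deadline(delay, deadline=timedelta(days=30))  # 30 days is a placeholder
```

With these assumed numbers the bound is a little over four days, which is the kind of figure you could write into the documentation.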

lucianbr 7 months ago

If vacuum runs at least once per day, seems pretty GDPR compliant to me. Even if it runs once every two or three days.

Now if your database runs vacuum once every 6 months, yeah, DELETE might not actually be a delete. But is it really a GDPR issue? What's really going on in this system?

I don't think any EU agency is going to fine your company if the data you say you deleted survived 6 or even 60 hours after deletion, if that is the end of it.