You'd think the go-to workflow for releasing redacted PDFs would be to draw black rectangles and then rasterize to image-only PDFs :shrug:

selinkocalar 5 hours ago

As someone who's built an entire business on "anti-screenshots", this is brilliant.

PDF redaction fails are everywhere and it's usually because people don't understand that covering text with a black box doesn't actually remove the underlying data.

I see this constantly in compliance. People think they're protecting sensitive info but the original text is still there in the PDF structure.
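
To make the point above concrete, here is a minimal sketch using only the Python standard library. It models a simplified, uncompressed page content stream (real PDFs usually compress these, but extractors decompress them first). The helper names `build_page_content` and `extract_text` and the sample string are hypothetical, and the regex is a toy extractor, not a full PDF parser.

```python
import re
from typing import List


def build_page_content(secret: str) -> bytes:
    """Build a simplified page content stream: text first, black box drawn on top.

    Assumes `secret` contains no parentheses or backslashes (no escaping is done).
    """
    ops = [
        "BT /F1 12 Tf 72 720 Td",  # begin text object, select font, set position
        f"({secret}) Tj",          # show the text string
        "ET",                      # end text object
        "0 0 0 rg",                # set fill color to black
        "70 715 220 16 re f",      # draw and fill a rectangle over the text
    ]
    return "\n".join(ops).encode("latin-1")


def extract_text(content: bytes) -> List[str]:
    """Toy extractor: pull every literal string passed to the Tj operator."""
    found = re.findall(rb"\(([^()]*)\)\s*Tj", content)
    return [m.decode("latin-1") for m in found]


content = build_page_content("SSN 123-45-6789")
print(extract_text(content))  # ['SSN 123-45-6789']
```

The rectangle only paints over the glyphs when the page is rendered; the `Tj` operand is untouched, so anything that reads the stream rather than the pixels still sees it.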

	embedding-shape 5 hours ago
		Not to mention that some PDF editors preserve previous edits in the PDF file itself, which people also seem unaware of. Here is a more user-friendly description of the feature, so you don't have to read the specification itself: https://developers.foxit.com/developer-hub/document/incremen...
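
A rough way to check for this, sketched with the standard library only: an incremental save appends a new section to the end of the file, and each section ends with its own `%%EOF` marker, so the bytes up to an earlier marker are a candidate earlier revision. The function name `split_revisions` and the sample bytes are hypothetical, and the marker count is only a hint, since linearized PDFs can also contain more than one `%%EOF`.

```python
from typing import List

EOF_MARKER = b"%%EOF"


def split_revisions(data: bytes) -> List[bytes]:
    """Return one byte prefix per %%EOF marker found in the file.

    Each prefix ends at a marker and is a candidate earlier revision.
    """
    revisions = []
    pos = 0
    while True:
        idx = data.find(EOF_MARKER, pos)
        if idx == -1:
            break
        end = idx + len(EOF_MARKER)
        revisions.append(data[:end])
        pos = end
    return revisions


# Synthetic stand-in for a file that was saved once, then edited and saved again.
original = b"%PDF-1.7\n(original objects, including the unredacted text)\n%%EOF\n"
update = b"(appended objects that replace the page content)\n%%EOF\n"
revs = split_revisions(original + update)
print(len(revs))  # 2
```

If a real file yields more than one prefix, opening the earlier ones in a viewer is a quick way to see whether an old version of the page is still recoverable.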

shbooms 5 hours ago

Oftentimes you will have requirements that the documents you release be digitally searchable, so in those cases this would not be an option.

	pottertheotter 4 hours ago
		This made me think of something I came across recently that’s almost the opposite problem of requiring PDFs to be searchable. A local government would publish PDFs where the text is clearly readable on screen, but the selectable text layer is intentionally scrambled, so copy/paste or search returns garbage. It's a very hostile thing to do, especially with public data!
	8note 5 hours ago
		Run some OCR on them afterward to recreate the text layer?
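
One possible way to do that, assuming the third-party `ocrmypdf` command-line tool is installed: its `--force-ocr` option rasterizes every page and then runs OCR, so the new text layer comes from the rendered pixels rather than from whatever text the file carried before. The wrapper names and file paths below are hypothetical, and this is a sketch rather than a vetted redaction pipeline.

```python
import shutil
import subprocess
from typing import List


def build_ocr_command(src: str, dst: str) -> List[str]:
    """Build the ocrmypdf invocation; --force-ocr rasterizes each page before OCR."""
    return ["ocrmypdf", "--force-ocr", src, dst]


def add_text_layer(src: str, dst: str) -> None:
    """Run ocrmypdf on src and write the searchable result to dst."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    subprocess.run(build_ocr_command(src, dst), check=True)


# Hypothetical file names; nothing is executed here, the command is only printed.
print(build_ocr_command("redacted.pdf", "redacted-searchable.pdf"))
```

It is still worth extracting the text from the output afterward to confirm that nothing from under the black boxes came through.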
