wahern | 6 hours ago
It should be much easier than that. You should be able to serially test whether each candidate edit decodes to a sane PDF structure, reducing the cost much as you can crack passwords when the server doesn't use a constant-time memcmp: each partial success prunes the search space. Are PDF streams typically compressed by default? If so, that makes it even easier, given the built-in checksums. But it's just not something you can do by throwing data at existing tools. You'd need to build a testing harness with instrumentation deep in the bowels of the decoders. This kind of work is the polar opposite of what AI code generators or naive scripting can accomplish.
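(A minimal sketch of the decode-and-check idea, assuming the streams in question use FlateDecode, i.e. zlib, whose Adler-32 trailer is the "built-in checksum" above. The `decodes_cleanly` helper name is made up for illustration; a real harness would hook the decoder far deeper than this.)

```python
import zlib

def decodes_cleanly(candidate: bytes) -> bool:
    """Return True if `candidate` inflates as a complete zlib stream
    with a valid Adler-32 trailer (as in PDF FlateDecode streams)."""
    try:
        d = zlib.decompressobj()
        d.decompress(candidate)
        d.flush()
        return d.eof  # True only if the stream terminated and the checksum verified
    except zlib.error:
        return False

# A well-formed stream passes; corrupting the checksum trailer fails it,
# so each candidate edit can be accepted or rejected serially.
stream = zlib.compress(b"1 0 obj << /Type /Page >> endobj")
assert decodes_cleanly(stream)
corrupted = stream[:-1] + bytes([stream[-1] ^ 0xFF])
assert not decodes_cleanly(corrupted)
```

Because inflation fails as soon as the stream goes inconsistent, each guess is confirmed or rejected independently, which is the same reason non-constant-time comparisons leak passwords byte by byte.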
pimlottc | 4 hours ago
I wonder if you could leverage some of the fuzzing frameworks that tools like Jepsen rely on. I'm sure there's got to be one for PDF generation.
cluckindan | 6 hours ago
On the contrary, that kind of one-off tooling seems like a great fit for AI. Just specify the desired inputs, outputs, and behavior as accurately as possible.