hallole 15 hours ago

Thanks for this. Really quells the urge I get every so often to just code my own PDF editor, because they all suck and certainly it couldn't be THAT hard. Such hubris!

brailsafe 12 hours ago | parent | next [-]

Heh, have at it, here's the full spec: https://developer.adobe.com/document-services/docs/assets/5b...

Should take... a weekend tops? ;) PDF is crazy and scary

marcosdumay 9 hours ago | parent | next [-]

> PDF includes eight basic types of objects: Boolean values, Integer and Real numbers, Strings, Names, Arrays, Dictionaries, Streams, and the null object

Wait, this is more complete than SOAP. It may be a good idea to redo the IPC protocol with a different serialization format!
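For anyone curious what those eight object types look like on the wire, here is a rough Python sketch that serializes each one using the PDF syntax. The `Name` and `Stream` classes and the `serialize` function are made up for illustration and are not from any PDF library. It also assumes names contain only plain ASCII letters and that strings fit the simple literal `(...)` form, which real files do not guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """Hypothetical wrapper for a PDF name object such as /Type."""
    value: str


@dataclass
class Stream:
    """Hypothetical wrapper for a PDF stream: a dictionary plus raw bytes."""
    dictionary: dict
    data: bytes


def serialize(obj) -> bytes:
    """Serialize one of the eight basic object types to PDF syntax."""
    if obj is None:                      # the null object
        return b"null"
    if isinstance(obj, bool):            # check bool before int: bool is an int subclass
        return b"true" if obj else b"false"
    if isinstance(obj, int):             # integer number
        return str(obj).encode("ascii")
    if isinstance(obj, float):           # real number, fixed-point (no exponent form)
        return f"{obj:.4f}".rstrip("0").rstrip(".").encode("ascii")
    if isinstance(obj, bytes):           # literal string, escaping \ ( )
        escaped = (obj.replace(b"\\", b"\\\\")
                      .replace(b"(", b"\\(")
                      .replace(b")", b"\\)"))
        return b"(" + escaped + b")"
    if isinstance(obj, Name):            # name (assumes no characters needing #xx escapes)
        return b"/" + obj.value.encode("ascii")
    if isinstance(obj, list):            # array
        return b"[" + b" ".join(serialize(item) for item in obj) + b"]"
    if isinstance(obj, dict):            # dictionary with Name keys
        parts = [serialize(k) + b" " + serialize(v) for k, v in obj.items()]
        return b"<< " + b" ".join(parts) + b" >>"
    if isinstance(obj, Stream):          # stream: dictionary with /Length, then the bytes
        header = {**obj.dictionary, Name("Length"): len(obj.data)}
        return serialize(header) + b"\nstream\n" + obj.data + b"\nendstream"
    raise TypeError(f"not a PDF object type: {type(obj).__name__}")


print(serialize({Name("Type"): Name("Page"), Name("MediaBox"): [0, 0, 612, 792]}).decode())
```

The object model really is that small. The hard parts are everything layered on top of it.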

jaggederest 7 hours ago | parent [-]

Well, it's a descendant of PostScript (much like JSON is a descendant of JavaScript, loosely).

Society would probably never recover if we started implementing RPC-in-PostScript though.

embedding-shape 11 hours ago | parent | prev | next [-]

7.5.6 "Incremental updates" from the specification is an interesting section too, speaking about accessing data people didn't think to remove from PDF files properly.
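To make that concrete: an incremental update appends new objects, a new cross-reference section, and a new trailer ending in its own `%%EOF`, while leaving the earlier bytes in place. A rough heuristic for spotting older revisions is therefore to look for more than one `%%EOF` marker. The sketch below is only that, a heuristic. The function names are made up, linearized files can legitimately carry an extra `%%EOF`, and the marker can also turn up inside stream data.

```python
import re

# %%EOF followed by optional trailing spaces and one end-of-line sequence.
EOF_MARKER = re.compile(rb"%%EOF[ \t]*(?:\r\n|\r|\n)?")


def revision_ends(data: bytes) -> list[int]:
    """Return the byte offset just past each %%EOF marker found in the file."""
    return [match.end() for match in EOF_MARKER.finditer(data)]


def earlier_revisions(data: bytes) -> list[bytes]:
    """Return the file as it (probably) looked before each later update.

    Each element is a prefix of the file ending at an earlier %%EOF,
    so it can be saved and opened as a standalone document.
    """
    ends = revision_ends(data)
    return [data[:end] for end in ends[:-1]]


# Toy byte string standing in for a real file (not a valid PDF).
sample = b"%PDF-1.7\nfirst\n%%EOF\nsecond\n%%EOF\n"
print(len(revision_ends(sample)), "EOF markers,",
      len(earlier_revisions(sample)), "earlier revision(s)")
```

If `earlier_revisions` returns anything for a document that was supposedly redacted, it is worth opening those prefixes to see what is still in there.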

CamperBob2 11 hours ago | parent | prev [-]

We will be able to say that AGI has arrived when we can hand that spec off to a model and tell it to build an Acrobat clone.

exasperaited 3 hours ago | parent [-]

We will be able to say that AGI has arrived when the AI hands it back and says "no".

gregsadetsky 13 hours ago | parent | prev | next [-]

Don't stop yourself before getting started. I believe in you - maybe you could write the one editor that would actually work!

Not kidding - it's a ~billion-dollar market, haha

Make an MVP/Show HN :-)

kayodelycaon 10 hours ago | parent | prev | next [-]

I did a bunch of work creating PDFs using a low-level API, "object goes here" stuff.

As far as I understand it, at its core, PDF is just a stream of instructions that is continually modifying the document. You can insert a thousand objects before you start the next word in a paragraph. And this is just the most basic stuff. Anything on a page can be anywhere in the stream. I don't know if you can go back and edit previous pages, but you might have a shot at least trying to understand one page at a time.
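A small Python sketch of what that looks like inside a single page's content stream, using a handful of standard operators (`BT`/`ET` to begin and end text, `Tf` to pick a font and size, `Td` to move, `Tj` to show a string, `re` and `f` to draw and fill a rectangle). The helper names are made up, and `/F1` is assumed to be a font defined in the page's resources. Operands come before the operator, and nothing stops a drawing operation from sitting between two words of the same sentence.

```python
def text_op(x: int, y: int, text: str, font: str = "F1", size: int = 12) -> str:
    """Build a text-showing fragment: position the cursor and paint one string."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"BT /{font} {size} Tf {x} {y} Td ({escaped}) Tj ET"


def rect_op(x: int, y: int, width: int, height: int) -> str:
    """Build a filled-rectangle fragment."""
    return f"{x} {y} {width} {height} re f"


# Two words from the same line of text, with an unrelated shape drawn in between.
content_stream = "\n".join([
    text_op(72, 720, "Hello"),
    rect_op(72, 700, 100, 5),
    text_op(110, 720, "world"),
])
print(content_stream)
```

An editor has to work out from coordinates alone that "Hello" and "world" belong together, which is a big part of why editing is so much harder than generating.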

Did you know you can have embedded XML in PDFs? You can have a paper form with all the data filled in and include an XML version of that for any computer systems that would like an easier way to read it.

TRiG_Ireland 10 hours ago | parent | prev | next [-]

The blog post about adding colour gradients to Typst dives into some of the weirdness of the format. https://typst.app/blog/2023/color-gradients

NamTaf 11 hours ago | parent | prev [-]

Bravo to you for recognising the load-bearing 'just' before you threw it around :)