kelnos a day ago

> Please don't turn nice formats into a format that's similar to screenshots of text

Converting HTML to PDF shouldn't result in an image wrapped in a PDF. Text will be preserved as text in the final PDF. (Unless the converter is garbage, of course.)

Aachen a day ago

If you've ever copied text out of a PDF, you'll know it's not the original text anymore. Besides ligatures, you get broken sentences with extra hyphens in the wrong places (where the PDF-rendered version had word or line breaks), if it even lets you properly select more than a few words at all. A PDF works like "put these couple of words at position x,y", not like HTML's semantic "here comes a heading" tag. That semantic markup helps people read your text accessibly, and even if you don't have an impairment or a mobile device with a narrower screen than this particular render was designed for, it also lets you work with the document more easily. It's like removing all the HTML and keeping only the CSS: all definitions of what's a section, sentence, emphasis, or caption are gone.
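To make that concrete, here is a rough sketch of what copying from a PDF amounts to. It is not how any particular PDF library works: the `Fragment` class, the coordinates, and the sample text are all made up for illustration. It only shows that once text is stored as positioned fragments, getting sentences back means guessing.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical stand-in for a PDF text run: glyphs at a position, no semantics."""
    x: float
    y: float
    text: str


# Made-up page content. Note the "fi" ligature (U+FB01) and the line-break hyphens.
fragments = [
    Fragment(72, 700, "The \ufb01nal docu-"),
    Fragment(72, 686, "ment has no head-"),
    Fragment(72, 672, "ings, only glyphs."),
]


def naive_copy(frags):
    # Order top to bottom (PDF y grows upward), then left to right, and join.
    ordered = sorted(frags, key=lambda f: (-f.y, f.x))
    return " ".join(f.text for f in ordered)


def best_effort_repair(text):
    # NFKC folds compatibility ligatures such as U+FB01 back into "fi".
    text = unicodedata.normalize("NFKC", text)
    # Guess that "hyphen + space" was a line break. This is only a heuristic:
    # it also mangles genuinely hyphenated words that happened to wrap.
    return text.replace("- ", "")


copied = naive_copy(fragments)
repaired = best_effort_repair(copied)
print(copied)
print(repaired)
```

The repair step is a guess, not a recovery: a genuinely hyphenated word like "well-known" that wrapped after the hyphen would come out as "wellknown", and nothing in the fragments says which line was a heading or a caption.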

I didn't mean literally an image, hence saying image-like. You get limitations similar to those of OCR, which seems very image-like to me.