nojito a day ago

The best way to parse PDFs is to convert them to images and feed them into the LLM.

This workflow is highly optimized.
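A rough sketch of the packaging step, in case it helps. It assumes you already have some renderer that turns a page into PNG bytes; `render_page` is a hypothetical callable you would supply, and the dict layout is an illustrative placeholder, not any particular vendor's request schema.

```python
import base64
from typing import Callable, Dict, List


def pages_to_image_parts(
    page_count: int,
    render_page: Callable[[int], bytes],  # hypothetical: returns PNG bytes for a 0-based page index
) -> List[Dict[str, str]]:
    """Render each page and wrap it as a base64-encoded image part."""
    parts = []
    for index in range(page_count):
        png_bytes = render_page(index)
        parts.append({
            # Illustrative field names only; adapt to whatever API you call.
            "type": "image",
            "media_type": "image/png",
            "page": str(index + 1),  # 1-based page label for humans
            "data": base64.b64encode(png_bytes).decode("ascii"),
        })
    return parts
```

One image part per page keeps the page order explicit, so the model's answer can be traced back to a specific page.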

wpasc a day ago

For sure there are very optimized ways to do it. My point is that a non-technical user will just drag and drop a PDF into a chatbot, and from a UX/product perspective they shouldn't have to think about it more than that, IMO. But seemingly that's a very expensive, inefficient way of doing it (burning through a whole context window trying to read it, reloading it multiple times per conversation, etc.).
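For the "reloading it multiple times per conversation" part, one possible mitigation is to cache the parsed result by content hash so the expensive step runs once per file. A minimal sketch, assuming a hypothetical `parse` callable that does the actual (costly) extraction; `ParsedPdfCache` is a made-up name, not something from an existing product:

```python
import hashlib
from typing import Callable, Dict


class ParsedPdfCache:
    """Runs the expensive parse once per distinct file content."""

    def __init__(self, parse: Callable[[bytes], str]) -> None:
        self._parse = parse  # hypothetical: the costly PDF-to-text/LLM step
        self._results: Dict[str, str] = {}
        self.parse_calls = 0  # counts how often the costly step actually ran

    def get(self, pdf_bytes: bytes) -> str:
        # Key on the file's bytes, so the same upload is recognized
        # even if the user drops it in again under a different name.
        key = hashlib.sha256(pdf_bytes).hexdigest()
        if key not in self._results:
            self.parse_calls += 1
            self._results[key] = self._parse(pdf_bytes)
        return self._results[key]
```

The user still just drags and drops; the app calls `get()` on every turn and only pays for parsing the first time.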

seemaze a day ago

Absolutely this. Never try to parse a native PDF document with any expectation of coherence or consistency.