cpursley 5 days ago

How are you prepping the PDF data before shoving it into Qwen?

Alifatisk 5 days ago

I just compress the file as small as I can without losing quality. I didn't even know there were other ways to prep it.

I do sometimes split the PDF into smaller PDFs, one per chapter.
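For anyone who wants to script the per-chapter split, here is a rough sketch. It assumes the third-party pypdf package and that you already know the 0-based page index where each chapter starts; the names `chapter_ranges`, `split_pdf`, and the `chapters` output folder are made up for illustration, not something from this thread.

```python
from pathlib import Path


def chapter_ranges(starts, total_pages):
    """Turn sorted 0-based chapter start pages into half-open (start, end) ranges."""
    bounds = list(starts) + [total_pages]
    return [(bounds[i], bounds[i + 1]) for i in range(len(starts))]


def split_pdf(src, starts, out_dir="chapters"):
    """Write one PDF per chapter and return the list of output paths."""
    # Assumption: pypdf is installed (pip install pypdf). It is imported here
    # so that chapter_ranges stays usable without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    outputs = []
    ranges = chapter_ranges(starts, len(reader.pages))
    for number, (start, end) in enumerate(ranges, 1):
        writer = PdfWriter()
        for index in range(start, end):
            writer.add_page(reader.pages[index])
        target = Path(out_dir) / f"chapter_{number:02d}.pdf"
        with open(target, "wb") as handle:
            writer.write(handle)
        outputs.append(target)
    return outputs
```

Something like `split_pdf("book.pdf", [0, 12, 30])` would then produce three files, with the last chapter running to the end of the book.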

amelius 5 days ago

On Linux you can also use pdftotext if you only care about the text.
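If you want to call it from a script, a minimal sketch might look like this. It assumes pdftotext (from poppler-utils) is on your PATH; `-f`, `-l`, and `-layout` are its first-page, last-page, and keep-layout options, and `-` as the output file sends the text to stdout. The function names are just placeholders.

```python
import subprocess


def pdftotext_command(pdf_path, first=None, last=None, layout=False):
    """Build the pdftotext argument list without running anything."""
    cmd = ["pdftotext"]
    if first is not None:
        cmd += ["-f", str(first)]  # first page to convert (1-based)
    if last is not None:
        cmd += ["-l", str(last)]  # last page to convert
    if layout:
        cmd.append("-layout")  # try to preserve the physical page layout
    cmd += [str(pdf_path), "-"]  # "-" writes the text to stdout
    return cmd


def extract_text(pdf_path, **options):
    """Run pdftotext and return the extracted text as a string."""
    result = subprocess.run(
        pdftotext_command(pdf_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise if pdftotext exits with an error
    )
    return result.stdout
```

For example, `extract_text("book.pdf", first=1, last=3)` would return the text of the first three pages only.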

navbaker 5 days ago

Not OP, but we use the docling library to extract the text and convert it to Markdown before storing it for use with an LLM.
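For reference, the basic docling flow is roughly the following. This is only a sketch based on docling's documented `DocumentConverter` usage, not the actual pipeline described above; the helper names and the `markdown` output folder are hypothetical.

```python
from pathlib import Path


def markdown_target(pdf_path, out_dir):
    """Pick the output path: same file stem, .md extension, inside out_dir."""
    return Path(out_dir) / (Path(pdf_path).stem + ".md")


def pdf_to_markdown(pdf_path, out_dir="markdown"):
    """Convert one PDF to Markdown with docling and save it to disk."""
    # Assumption: docling is installed (pip install docling). It is imported
    # here so that markdown_target stays usable without it.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(pdf_path)
    markdown = result.document.export_to_markdown()

    target = markdown_target(pdf_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(markdown, encoding="utf-8")
    return target
```

The saved `.md` files can then be chunked or passed to the model as plain text instead of the raw PDF.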