ljouhet 5 hours ago

Real question: what tool do you use? (for long/complex documents with tables, code, maths)

- marker (with --force-ocr) gives me the best results

- Mistral OCR (seems really great, but I never managed to get it to work)

- Mathpix (tried a long time ago)

- docling (gives me garbage, I must be using it wrong)

- Unlimited OCR (will try it)

- ???

Oras 4 hours ago

- Azure Document Intelligence (has an option to return Markdown too, including headers and footers).

- AWS Textract

badlibrarian 4 hours ago

Exactly. They're both very expensive and prone to surprising you. Sometimes in a good way, sometimes in a bad way. I'd rate them 85%, but you have to run a test because they both fail in different ways on the 15%.

ai_fry_ur_brain 4 hours ago

poma-ai has really great chunking techniques that chunk the document based on its structure/hierarchy.

We use it on 200-page IEEE standards that are notoriously complex, filled with tables and diagrams. Highly recommend.
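
To make "chunking based on document structure" concrete, here is a minimal sketch. It is not poma-ai's method or API; it assumes the document has already been converted to Markdown with `#`-style headings, and the function name `chunk_by_headings` is made up for illustration. Each chunk keeps the path of headings it sits under, so a table or paragraph deep in a standard still carries its section context.

```python
import re

# Hypothetical helper, for illustration only (not part of any library).
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")

def chunk_by_headings(markdown: str):
    """Split Markdown text into chunks tagged with their heading path."""
    chunks = []
    path = []   # stack of (level, title) for the headings currently open
    body = []   # lines collected since the last heading

    def flush():
        # Emit the collected lines as one chunk under the current path.
        text = "\n".join(body).strip()
        if text:
            chunks.append({"path": [title for _, title in path], "text": text})
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Close headings at the same or a deeper level before opening this one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks

if __name__ == "__main__":
    doc = "# Scope\nintro\n## Definitions\nterms here\n# Requirements\nrules here"
    for chunk in chunk_by_headings(doc):
        print(" > ".join(chunk["path"]), "|", chunk["text"])
```

This sketch ignores fenced code blocks, so a line starting with `#` inside one would be treated as a heading. It also does no size balancing, which a real pipeline would probably need.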