Xotic007 2 hours ago
Cool, but it relies on every extractor honoring that replacement-text property, which you said yourself is hit or miss. So it's clean markdown until someone runs it through a tool that ignores the property, quietly gets the messy version, and has no idea that happened.
SarthakGaud an hour ago
From my trials, it fails with OCR but works with popular libs like PyPDF2, etc.