How is the text extraction done? Tesseract?
I just use an LLM with a prompt (pls don't hate). I found Tesseract to be very bad at text extraction.
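
For anyone wondering what "an LLM with a prompt" could look like, here's a rough sketch of packaging an image plus an instruction into a request body. It's an assumption, not the actual setup: the prompt wording, the `build_extraction_request` helper, and the payload field names are all made up for illustration and don't match any particular vendor's API, so you'd need to adapt the shape to whatever model endpoint you use.

```python
import base64
import json

# Hypothetical prompt; the post above does not say which prompt is used.
EXTRACTION_PROMPT = (
    "Transcribe all text visible in this image exactly as written. "
    "Return only the text, with no commentary."
)


def build_extraction_request(image_bytes: bytes, prompt: str = EXTRACTION_PROMPT) -> dict:
    """Package an image and a prompt into a generic, vendor-neutral payload.

    The field names here are illustrative assumptions, not a real API schema.
    """
    return {
        "prompt": prompt,
        # Base64 lets the binary image travel inside a JSON body.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file's contents.
    payload = build_extraction_request(b"fake image bytes")
    print(json.dumps(payload, indent=2))
```

The actual network call is left out on purpose, since that part depends entirely on which provider and client library you pick.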