Have you tried paperless-ngx, a tried and tested open source solution that's been filling this niche successfully for years now?
They, too, offer integrations for LLMs these days, presumably for better OCR and classification.