Quick and easy: Gemini 2.0 Flash
More of a system: AWS Textract or Azure Document Intelligence. These options require some coding and cost more than using a vision model.
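
To give a rough idea of the "some coding" part, here is a minimal sketch for the Textract route using boto3. It assumes AWS credentials are already configured and that the input is a single PNG or JPEG image; the function names, the default region, and the file path in the usage note are placeholders of mine, not anything required by the service.

```python
def lines_from_textract(response):
    """Collect the text of every LINE block in a DetectDocumentText response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def extract_text(image_path, region_name="us-east-1"):
    """Send one image to Textract and return its text, one line per row.

    Assumption: credentials come from the usual AWS configuration
    (environment variables, ~/.aws/credentials, or an IAM role).
    """
    # Imported here so the parsing helper above can be used without boto3.
    import boto3

    client = boto3.client("textract", region_name=region_name)
    with open(image_path, "rb") as f:
        response = client.detect_document_text(Document={"Bytes": f.read()})
    return "\n".join(lines_from_textract(response))
```

Calling `extract_text("scan.png")` would return the detected text as a single string. Multi-page PDFs, tables, and form fields need the asynchronous or analysis APIs instead, which is where most of the extra effort goes.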