ismailmaj 2 days ago
I don't know why people mess with Tesseract in 2026. Attention-based OCRs (and more recently VLMs) have outperformed every LSTM-based approach since at least 2020. My guess is that it's the entry point to OCR and the internet is flooded with it, just like pandas for data processing.
mettamage 2 days ago
Painful comparison, haha. Leaving a comment so I can find this more easily. And for the people wondering about pandas: use Polars instead.
eichin 2 days ago
I was surprised to learn (from this article) that there are local models that can do this. I'm not sure any of them run on hardware I actually have, though, unlike Tesseract, which works fine on the scanning hardware I set up for it ~5 years ago. For privacy reasons, cloud-based OCR is a non-starter...
petercooper 2 days ago
Quite. I threw a so-so photo of an old, long receipt at Qwen 3.5 0.8B (runs in <2GB) and it nailed it, spitting out 20+ items in under a second. AI is good at many things, but picking modern dependencies, not so much.
segmondy a day ago
Yup, deepseek-ocr-2 would have crushed this. Then there's glm-ocr, dots-ocr, paddle-ocr-vl, etc. Tons of options ...