kbyatnal 4 hours ago

Deepseek OCR is no longer state of the art. There are much better open source OCR models available now.

ocrarena.ai maintains a leaderboard, and a number of other open source options like dots [1] or olmOCR [2] rank higher.

[1] https://www.ocrarena.ai/compare/dots-ocr/deepseek-ocr

[2] https://www.ocrarena.ai/compare/olmocr-2/deepseek-ocr

ckrapu 4 hours ago

I wasn't aware of dots when I wrote the blog post. This is really good to know!! I would like to try again with some newer models.

segmondy 4 hours ago

You are comparing to DeepSeek's old OCR; there's DeepSeek-OCR-2, which btw is amazing in my experiments. https://huggingface.co/deepseek-ai/DeepSeek-OCR-2

tclancy 4 hours ago

The article mentions choosing the model for its ability to parse math well.