deepsquirrelnet 3 days ago
Give the nanonets-ocr-s model a try. It’s a fine-tune of Qwen2.5-VL that I’ve had good success with for Markdown and LaTeX output with image captioning. It uses a simple tagging scheme for page numbers, captions, and tables.
davidwritesbugs 3 days ago
I've tried nanonets, but it seems very sensitive to the prompt: changing it slightly turned the output to rubbish. When it worked, it was pretty good.
captainregex 3 days ago
I desperately wanted Qwen VL to work, but it just unleashes rambling hallucinations from basic screencaps. Going to try nanonets!