treetalker 4 days ago

I presume this doesn't handle handwriting.

Does anyone have a suggestion for locally converting PDFs of handwriting into text, say on a recent Mac? Use case would be converting handwritten journals and daily note-taking.

nawazgafar 4 days ago | parent | next [-]

Author here, I tested it with this PDF of a handwritten doc [1], and it converted both pages accurately.

1. https://github.com/pnshiralkar/text-to-handwriting/blob/mast...

treetalker 4 days ago | parent [-]

Amazing, can't wait to try it!

FYI, your GitHub link tells me it's unable to render because the PDF is invalid.

simonw 4 days ago | parent | prev | next [-]

This one should handle handwriting - it's using Qwen 2.5 VL which is a vision LLM that is very good at handwritten text.

password4321 4 days ago | parent | prev | next [-]

I don't know about handwriting, so this is only barely relevant, but here is a new contender for a CLI "OCR Tool using Apple's Vision Framework API": https://github.com/riddleling/macocr, which I found while searching for this recent discussion:

My iPhone 8 Refuses to Die: Now It's a Solar-Powered Vision OCR Server

https://news.ycombinator.com/item?id=44310944

phren0logy 3 days ago | parent [-]

If you use Docling, you can set your OCR engine to OCRMac then set it to use LiveText. It’s a good arrangement. You can send these as command-line arguments, but I generally configure it from the Python API.
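
A minimal sketch of that Python-side setup, not a definitive recipe. It assumes macOS with Docling and its ocrmac support installed, and that your Docling version's OcrMacOptions accepts framework="livetext". The helper names (ocrmac_settings, build_converter, pdf_to_markdown) are made up for illustration.

```python
# Sketch only: assumes Docling with ocrmac support on macOS, and that
# OcrMacOptions in your installed version accepts framework="livetext".


def ocrmac_settings(use_livetext: bool = True) -> dict:
    """Keyword arguments intended for Docling's OcrMacOptions (hypothetical helper)."""
    return {
        # "vision" is the other framework value; "livetext" needs a recent macOS.
        "framework": "livetext" if use_livetext else "vision",
        # OCR the whole page, which suits scanned or handwritten PDFs.
        "force_full_page_ocr": True,
    }


def build_converter(use_livetext: bool = True):
    """Build a DocumentConverter whose PDF pipeline uses the OCRMac engine."""
    # Imported lazily so the settings helper works without Docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import OcrMacOptions, PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    pipeline_options = PdfPipelineOptions()
    pipeline_options.do_ocr = True
    pipeline_options.ocr_options = OcrMacOptions(**ocrmac_settings(use_livetext))
    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )


def pdf_to_markdown(path: str) -> str:
    """Convert one PDF to Markdown text using the converter above."""
    result = build_converter().convert(path)
    return result.document.export_to_markdown()
```

Calling pdf_to_markdown("journal.pdf") (a hypothetical file name) would run the conversion. If your version rejects the framework argument, check the OcrMacOptions fields in the Docling docs.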

ntnsndr 4 days ago | parent | prev [-]

+1. I have tried a bunch of local models (albeit at the smaller end, b/c of hardware limits), and I haven't been able to get handwriting recognition working yet. But online Gemini and Claude do great. Hoping the local models catch up soon, as this is a wonderful LLM use case.

UPDATE: I just tried this with the default model on handwriting, and IT WORKED. Took about 5-10 minutes on my laptop, but it worked. I am so thrilled not to have to send my personal jottings into the cloud!