▲ | __rito__ 6 days ago
I have tried a bunch of things. What worked best for me was Surya [0], which runs fully locally on your laptop. I also tried EasyOCR [1], which is quite good too. I haven't tried it myself, but I would look at Paddle [2] if the previous two don't work for you. All of these are open source, so you don't need to pay a dime to anyone. [0]: https://github.com/VikParuchuri/surya
▲ | pmarreck 5 days ago
Got some questions (sorry for the necro, but I only discovered this thread by accident because I left it open in a tab, and it turns out to be super relevant to me). I have some out-of-print books that I want to convert into nice, reference-quality PDFs/EPUBs.
1) I don't mind destroying the binding to get the best quality. Any idea how I should do that?
2) I have a multipage double-sided scanner (Fujitsu ScanSnap). Would this be sufficient for the scanning part?
3) Is there anything that detects the font of the book text and reproduces it somehow, and that handles things like bold and italic, emitting them as Markdown or similar?
4) How do you de-paginate the raw text so it can be reflowed into, say, an EPUB or PDF that paginates according to the output device's page size and layout?
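On question 4, here is a minimal sketch of one way to de-paginate OCR output, assuming you already have one plain-text string per scanned page. The function name `depaginate` and the heuristics are illustrative assumptions, not something any of the tools mentioned in this thread provide: it joins the pages, rejoins words hyphenated across a line break, and treats blank lines as paragraph boundaries.

```python
import re


def depaginate(pages):
    """Merge per-page OCR text into a list of reflowable paragraphs.

    Hypothetical helper for illustration. Assumes a page break never
    coincides with a paragraph break unless a blank line marks it.
    """
    # Treat each page break as an ordinary line break.
    text = "\n".join(page.strip("\n") for page in pages)
    # Rejoin words split across lines: "para-\ngraph" -> "paragraph".
    # Caveat: this also fuses genuine compounds such as "well-\nknown".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines separate paragraphs; single newlines are only line wraps.
    paragraphs = re.split(r"\n\s*\n", text)
    # Collapse all remaining whitespace inside each paragraph.
    return [" ".join(p.split()) for p in paragraphs if p.strip()]


if __name__ == "__main__":
    pages = [
        "The quick brown fox jum-\nped over the lazy dog.\n\nA second para-",
        "graph continues here\nacross the page break.",
    ]
    for paragraph in depaginate(pages):
        print(paragraph)
```

Once the text is a list of paragraphs with no hard line or page breaks, a reflowable format such as EPUB leaves pagination to the reading device. Running headers, page numbers, and footnotes would need separate handling before this step.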
▲ | pmarreck 5 days ago
Wow, Surya looks legit! https://www.datalab.to/
▲ | carlosjobim 5 days ago
I would happily pay a dime, and more, for any of the solutions discussed in this thread if it came as a normal macOS program with a graphical user interface.