fngjdflmdflg 6 hours ago
These OCR improvements will almost certainly be brought to Google Books, which is great. Long term, this could make it possible to compress all non-digital rare books into a manageable size that can be stored for less than $5,000.[0] It would also be great for archive.org to move to this from Tesseract. I wonder what that would cost, both in raw compute to run it and via a paid API.
levocardia 2 hours ago
This is a really interesting "data flywheel" -- better model >> more usable data >> even better model
kridsdale3 4 hours ago
More Data for the Data Gods!