Remix clone Hacker News

	▲	MattRogish 2 hours ago
		I do OCR on images, and that's exactly what I do: I take one big image, slice it into many smaller ones, and send those to the LLM. Perfect every time, unlike sending the whole image, which resulted in hot garbage.
	▲	freefaler an hour ago
		That works with relatively good scans. With bad or skewed scans, and especially with documents full of label/value pairs that aren't nicely tucked inside sentences, context matters: the more of the page the model sees, the better it can find the correct words and fix errors. There is a whole class of tricky documents. A decent post about this problem (if you ignore the marketing bias) can be found here: https://getomni.ai/blog/ocr-benchmark
	▲	ryanisnan 34 minutes ago
		How do you know where to slice an image? What if you slice an image mid-word?
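
One common way to reduce the mid-word problem is to make the slices overlap, so a word cut at one slice's edge shows up whole in the next one. The thread doesn't say how the slicing above is actually done; this is only a minimal sketch, and `tile_boxes`, the tile size, and the overlap value are all made up for illustration. It only computes crop rectangles and does not call any OCR or LLM API.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, upper, right, lower) crop boxes covering the image.

    Hypothetical helper: neighbouring tiles share at least `overlap` pixels,
    so text cut at one tile's edge should appear whole in the next tile,
    provided it is narrower than the overlap and the overlap is at most
    half the tile size.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")
    step = tile - overlap

    def starts(length):
        # Tile origins along one axis. The last tile is shifted back so it
        # ends exactly at the image edge instead of running past it.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

Each box is in the (left, upper, right, lower) form that Pillow's `Image.crop` accepts, so something like `image.crop(box)` per box would produce the slices. Words seen in two overlapping slices still have to be de-duplicated afterwards, which this sketch does not attempt.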