
	▲	brumar 2 hours ago
		Tangentially related: I don't think OCR is the right term, and I am generally vocal about that. But seeing it go unquestioned here, I am wondering if I am the one who is wrong. Is it OK to call this OCR? To me, OCR means text in the end, not visual tokens.
	▲	parsimo2010 2 hours ago
		OCR means optical character recognition. The term does not require a direct transcription, though that is mostly what OCR meant in the past. If you’re using an LLM’s vision capability to pass in text and the LLM actually understands it, then I would say that it recognized the characters, so OCR seems okay to use.
	▲	TurdF3rguson 2 hours ago
		It's not. OCR is not what the vision model is doing here. We're used to using OCR as a verb but it's more accurate to say the model "visioned" it. Also, some models still do OCR and it's usually way more expensive that way.
	▲	devmor 2 hours ago
		So if I OCR a document, edit it, and print it, OCR didn't happen?