Remix clone Hacker News

	qingcharles 2 hours ago
		I am trying to get rough summaries of long PDFs of scanned pages of text. At first I was doing OCR and passing the (tens of thousands of) characters into the LLM, which works, but it's expensive. I asked Gemini how to save costs and it said just send in all the images of the pages instead. Instinctively, as a developer, it's hard to fathom how sending 200 images is cheaper than sending the text, but it definitely works.
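		For anyone who wants to sanity-check the arithmetic, here is a rough sketch of the comparison. It assumes text is billed at about 4 characters per token and each page image at a flat number of tokens. Both defaults are placeholder assumptions, not figures from the comment or from any price sheet, and the real per-image count depends on the model and the image resolution, so substitute your provider's numbers. All function names are hypothetical.

```python
import math


def text_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    """Estimate tokens for OCR text (chars_per_token is an assumed average)."""
    return math.ceil(num_chars / chars_per_token)


def image_tokens(num_pages: int, tokens_per_image: int = 258) -> int:
    """Estimate tokens for page images billed at a flat assumed rate per image."""
    return num_pages * tokens_per_image


def break_even_chars_per_page(tokens_per_image: int = 258,
                              chars_per_token: float = 4.0) -> float:
    """Characters per page above which the image is the cheaper input."""
    return tokens_per_image * chars_per_token


if __name__ == "__main__":
    pages = 200
    total_chars = 40_000  # hypothetical OCR output size; replace with your own
    print("text tokens: ", text_tokens(total_chars))
    print("image tokens:", image_tokens(pages))
    print("break-even chars/page:", break_even_chars_per_page())
```

		Under these assumptions, images only win when a page carries more characters than the break-even value, so the result depends heavily on how dense the scanned pages are and on the actual per-image rate.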