qingcharles 2 hours ago
I am trying to get rough summaries of long PDFs of scanned pages of text. At first I was doing OCR and passing the (tens of thousands of) characters into the LLM, which works, but it's expensive. I asked Gemini how to save costs and it said just send in all the images of the pages instead. Instinctively, as a developer, it's hard to fathom how sending 200 images is cheaper than sending the text, but it definitely works.
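A rough back-of-the-envelope sketch of when images could come out cheaper than text. Everything here is an assumption for illustration, not measured pricing: the function names are made up, `chars_per_token=4.0` is a common rule of thumb for English text, and `tokens_per_image=258` is a placeholder for a flat per-image token charge (some multimodal APIs bill small images at a fixed rate and tile larger ones, which costs more). Check your provider's token-counting docs for real numbers.

```python
import math


def estimate_text_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough token count for OCR text (assumed ~4 characters per token)."""
    return math.ceil(num_chars / chars_per_token)


def estimate_image_tokens(num_pages: int, tokens_per_image: int = 258) -> int:
    """Rough token count when each page is sent as one image.

    tokens_per_image is an assumed flat rate; real APIs may charge more
    for high-resolution pages that get split into tiles.
    """
    return num_pages * tokens_per_image


def break_even_chars_per_page(tokens_per_image: int = 258,
                              chars_per_token: float = 4.0) -> float:
    """Characters per page above which the image is the cheaper input."""
    return tokens_per_image * chars_per_token


if __name__ == "__main__":
    pages = 200
    chars_per_page = 2000  # hypothetical density for a scanned page of text

    text_tokens = estimate_text_tokens(pages * chars_per_page)
    image_tokens = estimate_image_tokens(pages)

    print(f"text:   ~{text_tokens} tokens")
    print(f"images: ~{image_tokens} tokens")
    print(f"break-even: ~{break_even_chars_per_page():.0f} chars per page")
```

Under these assumptions, images only win when a page carries more characters than the break-even figure, so dense scanned pages favor images while sparse ones favor OCR text. The sketch also ignores the cost of running OCR itself and any quality difference between the two inputs.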