gamblor956 3 hours ago

People really need to read their cites and not just the summaries.

The paper notes two things:

1) The compression ratio for visual text is better than it is for regular text, but the absolute space required is still higher for the images. OPs were talking about the space required, not the ratio.

2) The results of the OCR must still be fed into a text-based LLM for linguistic processing. Otherwise, all you have achieved is turning an image into a bunch of text.