clueless 6 hours ago

Given current LLM context size limitations, what is the state of the art for feeding large document/text blobs into an LLM for accurate processing?

simonw 6 hours ago | parent | next

The current generation of models all support pretty long context now. The Gemini family has had 1m tokens for over a year, GPT-4.1 is 1m, and interestingly GPT-5 is back down to 400,000. Claude 4 is 200,000, but there's a mode of Claude Sonnet 4 that can do 1m as well.
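As a rough sanity check for whether a document even fits, something like this is enough (a sketch: the limits are just the numbers above, and tiktoken's o200k_base encoding is only an approximation for the non-OpenAI models):

    # Sketch: check whether a document fits a model's advertised context window.
    # The limits are the figures above; tiktoken's o200k_base encoding is only
    # an approximation for the non-OpenAI models.
    import tiktoken

    CONTEXT_LIMITS = {
        "gemini-2.5-pro": 1_000_000,
        "gpt-4.1": 1_000_000,
        "gpt-5": 400_000,
        "claude-sonnet-4": 200_000,  # 1m in the long-context mode
    }

    def fits_in_context(text, model, reserve_for_output=8_000):
        enc = tiktoken.get_encoding("o200k_base")
        return len(enc.encode(text)) + reserve_for_output <= CONTEXT_LIMITS[model]

    with open("big_document.txt") as f:
        print(fits_in_context(f.read(), "gpt-4.1"))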

The bigger question is how well they perform across that much context. There are needle-in-a-haystack benchmarks that test exactly that, and most models are scoring quite highly on them now.

https://cloud.google.com/blog/products/ai-machine-learning/t... talks about that for Gemini 1.5.

Here are a couple of relevant leaderboards: https://huggingface.co/spaces/RMT-team/babilong and https://longbench2.github.io/
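If you want to roll a quick version of that test yourself, the core of a needle-in-a-haystack probe is tiny (a sketch; ask_model is a placeholder for whatever client you're calling):

    # Sketch of a needle-in-a-haystack probe: bury one fact at a chosen depth
    # in filler text and ask the model to retrieve it. ask_model is a
    # placeholder for whatever chat client you use.
    FILLER = "The quick brown fox jumps over the lazy dog. " * 50
    NEEDLE = "The secret passphrase is 'violet-kumquat-42'."

    def build_haystack(n_chunks, depth):
        chunks = [FILLER] * n_chunks
        chunks.insert(int(depth * n_chunks), NEEDLE)
        return "\n".join(chunks)

    def run_probe(ask_model, n_chunks=200, depth=0.5):
        prompt = build_haystack(n_chunks, depth) + "\n\nWhat is the secret passphrase?"
        return "violet-kumquat-42" in ask_model(prompt)

    # The real benchmarks (BABILong, LongBench v2, etc.) sweep context length
    # and needle depth and report accuracy per cell rather than pass/fail.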

clueless 6 hours ago | parent

Sorry, I should have been clearer: I meant open-source LLMs. And I guess the question is, how are closed-source LLMs doing it so well? And if open-source OpenNote is the best we have...

lysecret 6 hours ago | parent | prev

I generally use Gemini 2.5 Flash for this; it works incredibly well. So many traditionally hard things can now be solved by stuffing them into a pretty cheap LLM haha.
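Something like this, with the google-genai Python SDK (a sketch; the prompt and filename are placeholders):

    # Minimal sketch of "stuff the PDF into a cheap model" using the
    # google-genai Python SDK; the prompt and filename are placeholders.
    from google import genai
    from google.genai import types

    client = genai.Client()  # picks up GEMINI_API_KEY from the environment

    with open("statement.pdf", "rb") as f:
        pdf_bytes = f.read()

    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=[
            types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf"),
            "Extract every line item as JSON with date, description and amount fields.",
        ],
    )
    print(response.text)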

mekael 4 hours ago | parent

What do you mean by “traditionally hard” in relation to a PDF? Most if not all of the docs I’m tasked with parsing are secured, flattened, and handwritten, which can cause any tool (traditional or AI) to require a confidence score and manual intervention. It also might be that I just get stuck with the edge cases 90% of the time.
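The gate I mean is roughly this (a minimal sketch; extract_with_confidence stands in for whatever OCR or model pipeline you actually run):

    # Rough sketch of that confidence gate; extract_with_confidence stands in
    # for whatever OCR/model pipeline actually returns (text, confidence).
    REVIEW_THRESHOLD = 0.85

    def route_document(path, extract_with_confidence):
        text, confidence = extract_with_confidence(path)
        if confidence < REVIEW_THRESHOLD:
            return {"status": "needs_manual_review", "path": path}
        return {"status": "auto_processed", "text": text}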