clueless | 6 hours ago
Given current LLM context size limitations, what is the state of the art for feeding large documents or text blobs into an LLM for accurate processing?
simonw | 6 hours ago
The current generation of models all support pretty long context now: the Gemini family has had 1M tokens for over a year, GPT-4.1 is 1M, interestingly GPT-5 is back down to 400,000, and Claude 4 is 200,000, though there's a mode of Claude Sonnet 4 that can do 1M as well. The bigger question is how well they perform across that context. There are needle-in-haystack benchmarks that test exactly that, and most current models score quite highly on them. https://cloud.google.com/blog/products/ai-machine-learning/t... talks about that for Gemini 1.5. Here are a couple of relevant leaderboards: https://huggingface.co/spaces/RMT-team/babilong and https://longbench2.github.io/
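
If you want to sanity-check this yourself, a rough needle-in-a-haystack probe is only a few lines of Python. This is just a sketch using the OpenAI SDK; the model name, the filler text, and the "needle" are all made up for illustration and aren't part of any published benchmark.

    # Rough needle-in-a-haystack probe: bury one fact deep inside filler text
    # and see whether the model can pull it back out. Sketch only; assumes the
    # OpenAI Python SDK (>= 1.0) and OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    filler = "The sky was grey and the trains ran on time. " * 5_000  # roughly 50-60k tokens of noise
    needle = "The secret launch code is 7481-alpha."

    # Splice the needle in about three-quarters of the way through the haystack.
    cut = int(len(filler) * 0.75)
    haystack = filler[:cut] + needle + " " + filler[cut:]

    resp = client.chat.completions.create(
        model="gpt-4.1",  # any long-context model; the name here is illustrative
        messages=[{"role": "user",
                   "content": haystack + "\n\nWhat is the secret launch code?"}],
    )
    print(resp.choices[0].message.content)  # should come back with 7481-alpha

Real benchmarks vary the document mix, needle position, and context length systematically, but this is the basic shape of the test.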
lysecret | 6 hours ago
I generally use Gemini 2.5 Flash for this; it works incredibly well. So many traditionally hard things can now be solved by stuffing everything into a pretty cheap LLM, haha.
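
For what it's worth, the "stuff the whole doc into a cheap model" approach is a handful of lines with the google-genai Python SDK. A minimal sketch, assuming that SDK and an API key in the environment; the file name and prompt are made up:

    # The whole-document approach: read a big file and hand it straight to a
    # cheap long-context model. Sketch only; assumes the google-genai Python
    # SDK and an API key in the environment. The file path is illustrative.
    from google import genai

    client = genai.Client()  # picks up the API key from the environment

    with open("big_report.txt", encoding="utf-8") as f:
        document = f.read()  # a few hundred thousand tokens fits in a 1M context window

    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=document + "\n\nSummarize the key findings in five bullet points.",
    )
    print(response.text)

No chunking, no retrieval pipeline; the main trade-offs are per-request cost and latency as the document grows.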