ants_everywhere 3 days ago
I exhaust the 1-million-token context windows on multiple models multiple times per day. I haven't used Llama 4's 10-million-token context window, so I don't know how it performs in practice compared to the major closed-source offerings with smaller windows. But there is an induced-demand effect: as the context window grows, it opens up more possibilities, and those possibilities in turn get bottlenecked on needing an even bigger window.

For example, consider the idea of storing all Hollywood films on your computer. In the 1980s this was impossible. If you store them in DVD or Blu-ray quality, you could probably do it in a few terabytes. If you store them in full quality, you may be talking about petabytes.

Until recently we struggled to get a single full file into a context window. Now a lot of people feel a bit like "just take the whole repo, it's only a few MB".
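To put a rough number on the "whole repo" intuition (my own back-of-envelope sketch, not from the comment): with the common ~4-characters-per-token rule of thumb, a few MB of source is already on the order of a million tokens, i.e. a full 1M-token window. The ratio and the file extensions below are assumptions, not real tokenizer output.

    # Hedged sketch: does "the whole repo" fit in a context window?
    # CHARS_PER_TOKEN is a rule of thumb, not an exact tokenizer count.
    import os

    CHARS_PER_TOKEN = 4
    CONTEXT_BUDGET = 1_000_000  # a 1M-token window

    def estimate_repo_tokens(root="."):
        total_chars = 0
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                if name.endswith((".py", ".js", ".ts", ".md")):
                    try:
                        path = os.path.join(dirpath, name)
                        with open(path, encoding="utf-8", errors="ignore") as f:
                            total_chars += len(f.read())
                    except OSError:
                        pass  # skip unreadable files
        return total_chars // CHARS_PER_TOKEN

    tokens = estimate_repo_tokens()
    print(f"~{tokens:,} tokens vs a {CONTEXT_BUDGET:,}-token window")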
brulard 3 days ago
I think you misunderstand how context in current LLMs works. To get the best results, you have to be careful to provide only what is needed for the immediate task, and postpone context that's needed later in the process. If you give all the context at once, you will likely get noticeably degraded output quality. That's like giving a junior developer his first task: you wouldn't teach him every corner of your app, you'd give him just the context he needs. It is similar with these models. The ones that offered 1M or 2M tokens of context (Gemini etc.) got less and less useful after roughly 200k tokens of context. Maybe models will get better at picking relevant information out of a large context, but AFAIK that is not the case today.
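A minimal sketch of the staged-context approach described above: rank candidate snippets by relevance to the current task, pack only the top ones into a fixed per-turn token budget, and defer the rest to later turns. The keyword-overlap scoring and the 4-chars-per-token cost estimate are stand-in assumptions (a real tool would use embeddings or an index, and a proper tokenizer); all names here are illustrative.

    # Hedged sketch: give the model only what the immediate task needs.
    def score(snippet: str, task: str) -> int:
        # Naive keyword overlap as a stand-in for real relevance scoring.
        task_words = set(task.lower().split())
        return sum(1 for w in snippet.lower().split() if w in task_words)

    def pack_context(snippets, task, budget_tokens):
        """Return (context for this turn, snippets deferred to later turns)."""
        ranked = sorted(snippets, key=lambda s: score(s, task), reverse=True)
        chosen, deferred, used = [], [], 0
        for s in ranked:
            cost = len(s) // 4  # rough chars-per-token heuristic
            if used + cost <= budget_tokens:
                chosen.append(s)
                used += cost
            else:
                deferred.append(s)
        return chosen, deferred

    snippets = ["def login(user): ...",
                "CSS for the marketing page",
                "notes on the auth middleware"]
    now, later = pack_context(snippets, task="fix the login handler",
                              budget_tokens=150_000)

The design choice is keeping the per-turn budget well under the ~200k-token point where quality reportedly degrades; the deferred snippets get fed in on later turns, as the task actually needs them.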