simianwords 5 hours ago
The context window is not some physical barrier, but rather the attention just getting saturated. What did I get wrong here?
qsort 5 hours ago
> what did i get wrong here?

You don't know how an LLM works and you are operating on flawed anthropomorphic metaphors. Ask a frontier LLM what a context window is; it will tell you.
paradite 4 hours ago
In theory, auto-regressive models should not have a limit on context: they should be able to generate the next token conditioned on all previous tokens. In practice, when training a model, people pick a context window so that at inference time you know how much GPU memory to allocate for a prompt and can reject prompts that exceed the limit. Of course, performance also degrades as context gets longer, but I suspect the memory limit is the primary reason we have context window limits.
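As a back-of-the-envelope sketch of the memory argument (not any particular model's numbers; the layer/head/dim values below are hypothetical), the KV cache grows linearly with context length, which is why a server wants a known upper bound before it accepts a prompt:

```python
# Rough KV-cache memory estimate for a decoder-only transformer.
# All model dimensions are hypothetical, chosen only to show the arithmetic;
# real deployments also need memory for weights and activations.

def kv_cache_bytes(context_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Bytes to cache keys and values for one sequence (2x for K and V)."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

# Hypothetical 32-layer model, 8 KV heads of dim 128, fp16 cache.
for ctx in (8_192, 32_768, 131_072):
    gib = kv_cache_bytes(ctx, n_layers=32, n_kv_heads=8, head_dim=128) / 2**30
    print(f"context {ctx:>7}: ~{gib:.0f} GiB of KV cache per sequence")
```

With these made-up dimensions the cache goes from ~1 GiB at 8k tokens to ~16 GiB at 128k, so bounding the window is really bounding a memory allocation.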
kenjackson 3 hours ago
I think attention literally doesn't see anything beyond the context window. Even within the context window you may start to see attentional issues, but that's a different problem.
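A toy sketch of that point (NumPy, toy dimensions, not how any real model handles truncation): tokens outside the window aren't merely down-weighted, they are simply absent from the softmax, so they get zero influence by omission.

```python
import numpy as np

def attend_last_n(q, keys, values, window):
    """Attention for a single query over only the last `window` key/value pairs."""
    k, v = keys[-window:], values[-window:]      # everything earlier is dropped entirely
    scores = q @ k.T / np.sqrt(q.shape[-1])      # scaled dot-product scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax only over the kept positions
    return weights @ v                           # older tokens contribute nothing

rng = np.random.default_rng(0)
d = 16
q = rng.standard_normal(d)
keys = rng.standard_normal((100, d))
values = rng.standard_normal((100, d))
out = attend_last_n(q, keys, values, window=32)  # tokens 0..67 play no part at all
```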