IAmNotACellist | 4 days ago
This doesn't seem noteworthy. It's called a context window for a reason: the input is considered context. You could train an LLM to treat the context as potentially adversarial or irrelevant, and this phenomenon would go away, at the expense of the LLM sometimes dismissing real context as irrelevant. To me, this observation sounds as trite as: "randomly pressing a button while inputting a formula on your graphing calculator will occasionally make the graph look crazy." Well, yeah, you're misusing the tool.
devmor | 4 days ago
It sounds important to me. Humans are where context comes from. Humans rarely provide 100% relevant context, but they are pretty good at identifying irrelevant context they've been given. It seems to me that solving this problem is one approach to removing the need for "prompt engineering" and creating models that can better interpret prompts from people. Remember that what they're trying to create here isn't a graphing calculator - they want something conversationally indistinguishable from a human.
nomel | 4 days ago
This should be more of a problem for agents, where the context is less tightly bounded. But I would claim it's also a problem for the common LLM use case of "here's all my code, add this feature and fix this." How much of that code is irrelevant to the problem? Probably most of it.
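A minimal sketch of that pruning idea, under stated assumptions: instead of pasting the whole repo into the prompt, keep only files that mention identifiers from the request. The relevant_files and build_prompt helpers, the keyword heuristic, and the ./my_project path are all hypothetical illustrations, not any real tool's API.

    # Sketch: drop obviously irrelevant files before building an
    # "all my code" style prompt. Crude keyword matching only.
    from pathlib import Path

    def relevant_files(repo_root: str, request: str, exts=(".py",)) -> list[Path]:
        """Keep only files sharing words with the request (naive heuristic)."""
        root = Path(repo_root)
        if not root.is_dir():
            return []
        keywords = {w.lower().strip(".,") for w in request.split() if len(w) > 3}
        keep = []
        for path in root.rglob("*"):
            if path.suffix in exts and path.is_file():
                text = path.read_text(errors="ignore").lower()
                if any(k in text for k in keywords):
                    keep.append(path)
        return keep

    def build_prompt(request: str, files: list[Path]) -> str:
        """Assemble the task description plus only the files judged relevant."""
        parts = [f"Task: {request}\n"]
        for path in files:
            parts.append(f"--- {path} ---\n{path.read_text(errors='ignore')}")
        return "\n".join(parts)

    if __name__ == "__main__":
        req = "add retry logic to the HTTP client and fix the timeout bug"
        files = relevant_files("./my_project", req)
        print(f"including {len(files)} files instead of the whole repo")

Even a crude filter like this makes the prompt mostly relevant again; more serious approaches use embedding search or the repo's import graph, but the underlying point stands: most of the pasted code usually isn't needed for the task.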