▲ | ruslan_sure 6 days ago | |
I don't think it's helpful to put words in the LLM's mouth. To think about this properly, we need to describe how an LLM "thinks." It doesn't think in words, or move vague, unwieldy concepts around and then translate them into words the way humans do. It works with tokens and the probability of each token appearing next (see the toy sketch below). The key point is that these probabilities encode the "thinking" that originally stood behind sentences using those words in its training set, so it manipulates words along with the meaning behind them.

Now, to your points:

1) On adding more words to the context window: it's not about "more," it's about "enough." If you don't have enough context for your task, how will you accomplish it? "Go there, I don't know where."

2) On "problem solved": if the LLM suggests or does such a thing, it only means that, given the current context, this is how the average developer would solve the issue. So it's not an intelligence issue; it's a context and training-set issue. When you write that "software engineers can step back, think about the whole thing, and determine the root cause of a problem," notice that you're actually referring to context. If you don't have enough context, or a tool to add data, no developer (digital or analog) will be able to complete the task. | ||
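To make "tokens and their probability of appearing next" concrete, here is a minimal sketch in Python. The vocabulary and logit scores are made up for illustration; a real model derives its scores from the entire context window, which is why "enough context" matters so much.

    import math

    # Hypothetical next-token scores (logits) for a prompt like "The cat sat on the ..."
    vocab = ["mat", "floor", "cat", "banana"]
    logits = [3.1, 2.0, 0.5, -1.0]

    # Softmax turns raw scores into a probability distribution over the vocabulary.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    for token, p in zip(vocab, probs):
        print(f"{token}: {p:.3f}")

    # Generation repeats this step: pick a token from the distribution,
    # append it to the context, and score the vocabulary again.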
▲ | adastra22 6 days ago | parent [-] | |
> It doesn't think in words or move vague, unwieldy concepts around and then translate them into words, like humans do.

That seems to me like a perfectly fine description of the model's state space and chain-of-thought continuation. |