fc417fc802 | a day ago
Perhaps I misunderstand your point, but it seems to me that by the same logic a simple gradient descent algorithm wired up to a variety of different models and simulations would qualify as generalization during the training phase. The trouble is that it only ever "generalizes" as far as the person configuring the training run (and implementing the simulation, and so on) ensures that it does. In that case it seems analogous to an explicitly programmed algorithm.

Even if we were to accept the training phase as a very limited form of generalization, that still wouldn't apply to the output of the process: the trained LLM, as used for inference, is no longer learning. The point I was trying to make with the chess engine was that generalization doesn't seem to be required in order to perform that class of tasks (at least in isolation, i.e. post-training). It should therefore follow that we can't use "ability to perform the task" (i.e. beat a human at that type of board game) as a measure of whether or not generalization is occurring.

Hypothetically, if you could explain a novel rule set to a model in natural language, play a series of games against it, and after that it could reliably beat humans at that game, that would indeed be a type of generalization. However, my next objection would then be: sure, it can learn a new turn-based board game, but if I explain five other widely varying tasks to it that aren't board games, can it also learn all of those in the same way? Because that's really what we seem to mean when we say that humans or dogs or dolphins or whatever possess intelligence in a general sense.
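To make the "wired up" framing concrete, here's a minimal sketch of such a harness (the task, names, and hyperparameters are all illustrative, not from any particular system). Note that every choice bounding what the run can "generalize" to - model family, loss, data, step size - is made up front by a person:

```python
# Generic gradient descent "wired up" to whatever loss the operator supplies.
# The algorithm itself is fixed; the scope of what it can learn is entirely
# determined by the configuration chosen by the person running it.

def gradient_descent(loss, params, lr=0.01, steps=1000, eps=1e-6):
    for _ in range(steps):
        grads = []
        for i in range(len(params)):
            bumped = params[:]
            bumped[i] += eps
            # Forward-difference estimate of the partial derivative.
            grads.append((loss(bumped) - loss(params)) / eps)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

# The "task" is fixed in advance by the operator: fit y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(10)]
mse = lambda p: sum((p[0] * x + p[1] - y) ** 2 for x, y in data) / len(data)
w, b = gradient_descent(mse, [0.0, 0.0])
```

The loop never steps outside the model family and data handed to it; swap in a different loss and it "learns" a different task, but only because the person configuring it arranged that.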
voidspark | 21 hours ago | parent
You're muddling up some technical concepts here in a very confusing way. Generalization is the ability of a model to perform well on new, unseen data within the same task it was trained for. It's not about the training process itself. Suppose I showed you some examples of multiplication tables, and you figured out how to multiply 19 * 42 without ever having seen that example before. That is generalization: you recognized the underlying pattern and applied it to a new case.

AlphaGo Zero trained on games that it generated by playing against itself, but how that data was generated is not the point. It was able to generalize from that information to learn deeper principles of the game and beat human players. It wasn't just memorizing moves from a training set.

> However my next objection would then be, sure, it can learn a new turn based board game, but if I explain these other five tasks to it that aren't board games and vary widely can it also learn all of those in the same way? Because that's really what we seem to mean when we say that humans or dogs or dolphins or whatever possess intelligence in a general sense.

This is what LLMs have already demonstrated - a rudimentary form of AGI. They were originally trained for language translation and a few other NLP tasks, and then we found they have all these other abilities.
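The multiplication-table analogy above can be made concrete with a toy model (the log-feature trick is my own construction for illustration, not how any real learner works): fit on small multiplication examples, then evaluate on a pair well outside the training range.

```python
import math

# Train on products of 2..12 only; 19 * 42 never appears and both factors
# lie outside the training range.
# Log features make multiplication linear: log(a*b) = log(a) + log(b).
train = [(a, b) for a in range(2, 13) for b in range(2, 13)]
X = [(math.log(a), math.log(b)) for a, b in train]
y = [math.log(a * b) for a, b in train]

# Two-feature least squares via the normal equations (no intercept needed).
sxx = sum(x1 * x1 for x1, _ in X)
syy = sum(x2 * x2 for _, x2 in X)
sxy = sum(x1 * x2 for x1, x2 in X)
sxt = sum(x1 * t for (x1, _), t in zip(X, y))
syt = sum(x2 * t for (_, x2), t in zip(X, y))
det = sxx * syy - sxy * sxy
w1 = (syy * sxt - sxy * syt) / det
w2 = (sxx * syt - sxy * sxt) / det

# The fitted rule (w1 and w2 both close to 1) applies to the unseen case.
pred = math.exp(w1 * math.log(19) + w2 * math.log(42))
```

Because the model recovers the underlying rule rather than memorizing the training pairs, it extrapolates to 19 * 42 - which is the distinction being drawn between generalization and lookup.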