▲ | voidspark a day ago |
But that's exactly what these deep neural networks have shown, countless times. LLMs generalize to new data outside their training set. It's called "zero-shot learning": they can solve problems that are not in their training set. AlphaGo Zero is another example. AlphaGo Zero mastered Go from scratch, beating professional players with moves it was never trained on.

> Another is the fundamental inability to self update

That's an engineering decision, not a fundamental limitation. They could engineer a solution for the model to initiate its own training sequence, if they decide to enable that.
▲ | no_wizard a day ago | parent | next [-]
> AlphaGo Zero mastered Go from scratch, beating professional players with moves it was never trained on

That's all well and good, but it was tuned with enough parameters to learn via reinforcement learning [0]. I think The Register went further and got better clarification about how it worked [1]:

> During training, it sits on each side of the table: two instances of the same software face off against each other. A match starts with the game's black and white stones scattered on the board, placed following a random set of moves from their starting positions. The two computer players are given the list of moves that led to the positions of the stones on the grid, and then are each told to come up with multiple chains of next moves along with estimates of the probability they will win by following through each chain.

While I also find it interesting that in both of these instances it's all referred to as machine learning, not AI, it's also important to see that even though what AlphaGo Zero did was quite awesome and a step forward in using compute for more complex tasks, it was still seeded with the basics of information - the rules of Go - and simply pattern-matched against itself until it built up enough of a statistical model to determine the best move to make in any given situation during a game. That isn't the same thing as showing generalized reasoning: it could not then take this information and apply it to another situation. They did show that the self-reinforcement techniques worked well, and used them for Chess and Shogi to great success as I recall, but that's a validation of the technique, not evidence that it could generalize knowledge.

> That's an engineering decision, not a fundamental limitation

So you're saying that they can't reason independently?

[0]: https://deepmind.google/discover/blog/alphago-zero-starting-...
[1]: https://www.theregister.com/2017/10/18/deepminds_latest_alph...
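The self-play setup The Register describes (two instances of the same player sharing one model, each estimating its probability of winning) can be sketched in miniature. This is a hypothetical toy, not DeepMind's actual method: a tabular value function learns a one-pile Nim variant (take 1-3 stones; taking the last stone wins) purely by playing against a copy of itself.

```python
import random

def legal_moves(stones):
    """Moves available in the toy game: take 1, 2, or 3 stones."""
    return [t for t in (1, 2, 3) if t <= stones]

def self_play_train(games=5000, start=10, eps=0.2, lr=0.1, seed=0):
    """Two copies of the same value table play each other and both learn
    from every game's outcome (a crude stand-in for AlphaGo Zero-style
    self-play; the real system uses a neural net and tree search)."""
    rng = random.Random(seed)
    # value[s] ~ estimated probability that the player to move at s wins.
    # value[0] = 0: if it's your turn with no stones left, you already lost.
    value = {0: 0.0}
    value.update({s: 0.5 for s in range(1, start + 1)})
    for _ in range(games):
        history = []                       # (state, player) pairs, in order
        stones, player = start, 0
        while stones > 0:
            history.append((stones, player))
            moves = legal_moves(stones)
            if rng.random() < eps:
                take = rng.choice(moves)   # occasional exploration
            else:
                # greedy: leave the opponent the worst position for them
                take = min(moves, key=lambda t: value[stones - t])
            stones -= take
            player ^= 1
        winner = player ^ 1                # whoever took the last stone
        for s, p in history:               # both "copies" share one table
            target = 1.0 if p == winner else 0.0
            value[s] += lr * (target - value[s])
    return value

value = self_play_train()
# Nim theory: the player to move loses when the pile is a multiple of 4,
# so learned values at 4 and 8 should end up low, and at 1-3 high.
```

Note that this illustrates no_wizard's point as much as voidspark's: the loop is seeded with the rules (`legal_moves`) and converges to strong play within that one game, but the resulting table says nothing about any other game.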
| ||||||||||||||||||||||||||
▲ | dontlikeyoueith a day ago | parent | prev [-]
This comment is such a confusion of ideas that it's comical.