▲ GistNoesis | 7 days ago
> This comment and your other comments are simply wrong and full of nonsense

That's because you are not understanding me. My conviction is that the complexity of chess is not as high as we think, and that there exists a neural network of fewer than 10B float32 weights which can encode perfect play.

Neural network evaluations are now used as heuristics to evaluate positions, and even without tree search they play very well, but they are usually very small and complemented by a few high-level features as input. A bigger, well fine-tuned network can probably reach perfect play without any tree search.

A thought exercise: try to compress an endgame tablebase with a neural network and see how big it needs to be to reach perfect play. The thing is, you don't need to train it on all positions from the endgame tablebase before it converges to perfect play. You can know how close you are to optimal by counting the number of Bellman equation violations (or by not observing any).

You can even train it by having it reference a previously trained endgame tablebase oracle. You solve chess with a neural network when there are only 2 pieces. Then you solve chess for 3 pieces, possibly using the 2-piece oracle. Then you solve chess for 4 pieces using the 3-piece oracle, and so on ... until you reach chess for 32 pieces. Adding pieces only increases complexity up to a point.
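A minimal sketch of the violation-counting idea above, assuming the python-chess library. The `net` argument is a hypothetical stand-in for the trained network (returning +1/0/-1 from the side to move's perspective), and the claim that zero violations implies optimality is the commenter's, not an established result:

    import chess

    def terminal_value(board: chess.Board) -> int:
        """Value of a finished game from the side to move's perspective:
        checkmated -> -1, any draw (stalemate, repetition, ...) -> 0."""
        return -1 if board.is_checkmate() else 0

    def backup_value(board: chess.Board, net) -> int:
        """One-ply negamax backup of the network's own evaluations."""
        best = -1
        for move in board.legal_moves:
            board.push(move)
            child = terminal_value(board) if board.is_game_over() else net(board)
            board.pop()
            best = max(best, -child)
        return best

    def count_bellman_violations(positions, net) -> int:
        """Count positions where net() disagrees with the backup of its
        own values, or mislabels a terminal position."""
        violations = 0
        for board in positions:
            if board.is_game_over():
                target = terminal_value(board)
            else:
                target = backup_value(board, net)
            violations += int(net(board) != target)
        return violations

    # Illustrative use: a dummy "network" that calls every position a draw.
    # It is consistent on the start position, but any mate-in-1 would violate.
    print(count_bellman_violations([chess.Board()], lambda b: 0))  # -> 0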
|
▲ jll29 | 7 days ago
> the complexity of chess is not as high as we think, and that there exists a neural network of fewer than 10B float32 weights which can encode perfect play.

That is certainly not the mainstream view, so you will have to support your conjecture with some evidence if you would like to convince people that you are right (or demonstrate it empirically, end to end).
▲ GistNoesis | 6 days ago

My conviction is based on an interesting phenomenon: game tree collapse. As you get better at the game, the number of "candidate" moves you need to explore goes down.

For example, when you are naive at chess you need to explore all legal moves, and then all legal replies to those moves. The number of nodes in this tree grows exponentially, so the search is limited to a small depth. But when your candidate list is reduced to size 1, the number of nodes in the tree equals its depth: you can unroll the line very cheaply. (That's the concept of a "line" in chess; in Go it's the concept of a "ladder".) When your candidate list is of size 2, the game tree is of size 2^(depth+1).

You can have a fractional candidate list size by only needing to descend the tree some of the time, or by needing to explore 2 candidates only some of the time. The complexity grows as (fractional candidate list size)^(depth+1), with the FCLS lying between 1 and 2; see the sketch below. A phase transition occurs that lets you go much deeper in the game tree, which simplifies things and makes them tractable (because you then reach endgame tablebases, or an equivalence-class number: you don't necessarily know the result yet, you just know it will be the same result as for all the other games in that class).

So your function is allowed to have a smaller inner function, called recursively to explore the game tree of candidate moves within a computational budget, to help make a decision. The outer function resolves the equivalence numbers to their win/draw/loss values, always in the same way, and takes the appropriate maximum (remember, your network's only goal is to be always consistent). The better your network gets at managing its computational budget, the deeper it can go. You can think of it as a version of alpha-beta pruning with improving heuristics.
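A back-of-the-envelope sketch of the growth claim above (the branching factors and depths are illustrative assumptions, not measurements): node count grows like FCLS^(depth+1), so shrinking the effective candidate list from chess's ~35 legal moves toward 1 is what makes deep lines cheap.

    def nodes_explored(branching: float, depth: int) -> float:
        """Total nodes in a tree with a (possibly fractional) effective
        branching factor b: 1 + b + b^2 + ... + b^depth."""
        if branching == 1.0:
            return depth + 1.0
        return (branching ** (depth + 1) - 1) / (branching - 1)

    # ~35 legal moves in a typical middlegame vs. pruned candidate lists.
    for b in (35.0, 2.0, 1.5, 1.0):
        print(f"b={b:>4}: depth 10 -> {nodes_explored(b, 10):.3g} nodes, "
              f"depth 40 -> {nodes_explored(b, 40):.3g} nodes")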
▲ jibal | 7 days ago

But ... but ... it's his conviction.
|
|
▲ jibal | 7 days ago
I do understand you, but you and your "conviction" are wrong. Apparently you aren't even familiar with AlphaZero.

> Adding pieces only increases complexity up to a point.

As you said, until you reach 32 pieces. What you vastly underestimate is how much complexity is added at each level. You're like the king in the fable who agreed to give the vizier a small amount of grain: 1 grain for the first square, 2 grains for the second square, 4 grains for the third square, etc. The king thought he was getting a bargain.

> The thing is, you don't need to train it on all positions from the endgame tablebase before it converges to perfect play.

But you do, because there is no algorithmic simplification at all. Strong chess players understand that while there are common patterns throughout chess, their application is highly specific to the position. That's why we have endgame tablebases, which are used to solve positions that pattern matching doesn't solve.

You can get excellent play out of an NN, but that's not the same as solving the game. And the absence of Bellman violations is necessary but not sufficient ... you can't use it to prove that you've solved chess. The fact is that it is impossible, within pragmatic limits, to prove that chess has been solved. But so what? Programs like AlphaZero and Stockfish already play well enough for any purpose.

Anyway, you're free to go implement this ... good luck. I won't respond further.
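The fable's arithmetic, for concreteness (nothing assumed beyond the doubling stated above, over a chessboard's 64 squares):

    # 1 + 2 + 4 + ... + 2^63 grains over 64 squares:
    total = sum(2 ** i for i in range(64))  # == 2**64 - 1
    print(total)  # 18446744073709551615, about 1.8e19 grains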
▲ GistNoesis | 7 days ago

> the absence of Bellman violations is necessary but not sufficient

It is sufficient, though. Every chess game ends in a finite number of moves. If you are consistent, i.e. you have zero violations, then thinking backward (dynamic programming), you can "color" the final positions correctly. And because you are consistent, you can color correctly all positions one turn before the end, then two turns before the end, ... and so recursively you have colored all chess positions correctly.

You are missing the complexity collapse which can occur in games, for example the game of Nim, where a simple function can predict the outcome. With 15 sticks, where each turn you may remove up to 3, naively one would think there are 2^15 game states and "15 choose 3" legal moves per turn, but in fact there are equivalence classes, which reduce the game to only 15 distinct states. Through the prism of a trained neural network, game states get grouped into equivalence classes, and the same phenomenon occurs in chess, which allows simplification by high-level rules like "white wins because white wins the pawn race".

I am more familiar with Stockfish than with AlphaZero or Leela Chess Zero, but Stockfish demonstrates that well-engineered features can bring the neural network size down a lot. In particular, counting usually poses problems for neural networks, and counting tasks like the number of moves before the 50-move rule triggers, or the number of moves before a pawn race resolves, are edge cases that can be simplified away (DTZ and DTM). Also, these engines are trying to compress the evaluation function, which is a lot more information than just whether the position is a win, draw, or loss, i.e. just the frontier.
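A minimal sketch of the backward-induction "coloring" and of Nim's collapse, assuming the usual subtraction-game rule (take 1 to 3 sticks per turn, last stick wins), which is my reading of the example:

    def color_game(n_sticks: int, max_take: int = 3) -> list[bool]:
        """Backward induction: win[n] is True iff the side to move wins
        with n sticks remaining (taking the last stick wins)."""
        win = [False] * (n_sticks + 1)   # win[0] = loss: no move available
        for n in range(1, n_sticks + 1):
            # Winning iff some move reaches a losing position for the opponent.
            win[n] = any(not win[n - k] for k in range(1, min(max_take, n) + 1))
        return win

    win = color_game(15)
    # The 2^15 naive state space collapses to a one-line rule:
    assert all(win[n] == (n % 4 != 0) for n in range(16))

Here the equivalence classes are just the stick counts mod 4.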
▲ jibal | 7 days ago

> It is sufficient, though. Every chess game ends in a finite number of moves.

Again, the issue is the size of that number, not that it is finite.

> You are missing the complexity collapse which can occur in games, for example the game of Nim

All you have is ad hominems ... "you don't understand", "you are missing" ... I'm not missing the intellectual dishonesty of obviously irrelevant examples like Nim, Rubik's Cube, or cryptanalysis, which have clear mathematical regularities completely lacking from chess. Now stop insulting me and my intelligence.

Again, if you want to go implement this, or write a journal paper, then have at it. But this "we should have solved chess already, and we should be aiming to solve chess, but we are not even trying" is arrogant trolling. Over and out.
|
|
|
▲ throw-qqqqq | 7 days ago
> My conviction is that the complexity of chess is not as high as we think, and that there exists a neural network of fewer than 10B float32 weights which can encode perfect play

Given the number of possible games (estimates run above 10^80), you would need EXTREME sparsity to encode it in fewer than 10B (10^10) parameters. Sounds information-theoretically impossible to me ¯\_(ツ)_/¯

Leela Chess Zero has hundreds of millions to a few billion parameters, AFAIK.

The argument that the game is finite and thus solvable is misguided, IMO. AES encryption is also finite, and you can enumerate all possible keys, but not before the end of time.
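Rough numbers behind that objection (the ~4.8 x 10^44 figure is a commonly cited upper bound on legal chess positions; whether chess has enough structure to compress far below this naive count is exactly what's in dispute above):

    import math

    capacity_bits = 10 ** 10 * 32    # 10B float32 weights, raw storage
    positions = 4.8e44               # upper-bound estimate of legal positions
    bits_per_label = math.log2(3)    # win / draw / loss per position
    needed_bits = positions * bits_per_label

    print(f"capacity ~{capacity_bits:.2g} bits, naive need ~{needed_bits:.2g} bits")
    print(f"shortfall: ~{needed_bits / capacity_bits:.2g}x without structure")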