GistNoesis | 7 days ago
> "just train it until it is perfect"

Yes, that is exactly the problem with the current approach based on a "valuation" function. These systems are not trying to aim for perfection, and therefore cannot make progress past a point. To progress you must precisely define the frontier: an evaluation of 0.1 is never resolved into one of "white wins", "draw", or "white loses", which in theory it must be. The network is not "committing" to anything. To train such a network to perfection, you must not train it only on "average" game states; you must also mine hard samples, the game states that define the frontier: find a candidate, find a violation, add it to the dataset of training examples, retrain to perfection on the growing dataset (or on a generator of hard positions) to find a new candidate, and loop.
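The candidate/violation loop described above can be sketched as a toy hard-example-mining routine. Everything here is hypothetical and for illustration only: a 1-D "game state" in [0, 1], a single learned threshold standing in for the network, and an oracle standing in for exact game-tree search. It shows the find-violation/retrain-to-perfection cycle, not any real engine's training code.

```python
import random

# Toy sketch of the "find a violation, retrain to perfection" loop.
# Hypothetical setup: a game state is a number in [0, 1], the true
# outcome is "white wins" iff state >= BOUNDARY, and the "network" is
# a single learned threshold. An oracle replaces the real (expensive)
# ground-truth resolution of a position.

BOUNDARY = 0.37  # the true frontier, unknown to the learner

def true_label(state):
    """Oracle: exact outcome of a state (in chess: exhaustive search)."""
    return 1 if state >= BOUNDARY else 0

def predict(threshold, state):
    return 1 if state >= threshold else 0

def train_to_perfection(dataset):
    """Return a threshold that classifies every stored example correctly."""
    losses = [s for s, y in dataset if y == 0]
    wins = [s for s, y in dataset if y == 1]
    lo = max(losses) if losses else 0.0
    hi = min(wins) if wins else 1.0
    return (lo + hi) / 2  # any value strictly between lo and hi is perfect on the data

def find_violation(threshold, rng, tries=10_000):
    """Candidate search: a state where the model and the oracle disagree."""
    for _ in range(tries):
        s = rng.random()
        if predict(threshold, s) != true_label(s):
            return s
    return None  # no violation found at this sampling budget

def hard_mining_loop(rng, max_rounds=100):
    dataset = [(0.0, 0), (1.0, 1)]  # seed examples
    threshold = train_to_perfection(dataset)
    for _ in range(max_rounds):
        s = find_violation(threshold, rng)
        if s is None:
            break  # frontier pinned down to sampling accuracy
        dataset.append((s, true_label(s)))  # the hard sample defines the frontier
        threshold = train_to_perfection(dataset)  # retrain to perfection
    return threshold, dataset

threshold, dataset = hard_mining_loop(random.Random(0))
```

Each mined violation lies between the current threshold and the true boundary, so retraining tightens the model around the frontier; the loop stops only when the sampler can no longer find a disagreement.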
WJW | 7 days ago | parent
So what makes you think it is possible to precisely define such a frontier? And why should such a thing, if it is possible at all, be (1) doable by humans and (2) doable with the energy and computing power available to us within the coming couple of decades?