GistNoesis 7 days ago

>"just train it until it is perfect"

Yes, that's exactly the problem with the current approach based on a "valuation" function.

They are not aiming for perfection, and therefore cannot make any further progress.

To progress, you must precisely define where the frontier is: an evaluation of 0.1 is not resolved to one of "white wins", "draw", or "white loses", which every position theoretically must be. The engines are not "committing" to anything.

To train such a network to perfection, you must not train only on the "average" game state, but also on mined hard samples: the game states that define the frontier.

Find a candidate, find a violation, add it to the dataset of training examples, retrain to perfection on the growing dataset (or on a generator of hard positions) to get a new candidate, and loop.
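To make the loop concrete, here is a minimal sketch on a toy game, not chess. It assumes a subtraction game (take 1 to 3 stones, whoever takes the last stone wins) small enough that an exact solver can act as the violation finder. That is a strong assumption which does not carry over to chess. The names `true_outcome`, `train`, `find_violation` and the tiny hypothesis class "the mover loses iff n % m == r" are all made up for illustration.

```python
from functools import lru_cache

MAX_TAKE = 3   # a move removes 1..MAX_TAKE stones (toy game, assumed)
MAX_N = 40     # positions 0..MAX_N are checked by the verifier


@lru_cache(maxsize=None)
def true_outcome(n):
    """Exact result for the player to move: 'win' or 'lose' (no draws here)."""
    # With n == 0 there is no move: the opponent took the last stone.
    moves = range(1, min(MAX_TAKE, n) + 1)
    if any(true_outcome(n - k) == "lose" for k in moves):
        return "win"
    return "lose"


def predict(hyp, n):
    """A candidate commits to an outcome: 'lose' iff n % m == r."""
    m, r = hyp
    return "lose" if n % m == r else "win"


def hypotheses():
    """Hypothetical model class, enumerated from simplest to most complex."""
    for m in range(1, 9):
        for r in range(m):
            yield (m, r)


def train(dataset):
    """'Train to perfection': first hypothesis with zero error on the dataset."""
    for hyp in hypotheses():
        if all(predict(hyp, n) == label for n, label in dataset.items()):
            return hyp
    raise ValueError("no hypothesis fits the dataset")


def find_violation(hyp):
    """Return the smallest position the candidate gets wrong, or None."""
    for n in range(MAX_N + 1):
        if predict(hyp, n) != true_outcome(n):
            return n
    return None


def solve():
    dataset = {}                           # growing set of hard examples
    while True:
        hyp = train(dataset)               # find a candidate
        bad = find_violation(hyp)          # find a violation
        if bad is None:
            return hyp, dataset            # no violation left on 0..MAX_N
        dataset[bad] = true_outcome(bad)   # add it to the dataset, then loop


final_hyp, hard_examples = solve()
print(final_hyp, hard_examples)
```

In this toy the loop stops after three counterexamples (positions 1, 2 and 3) with the rule "lose iff n % 4 == 0". The dataset only ever contains positions where a previous candidate was wrong, which is the "frontier" idea in miniature.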

WJW 7 days ago

So what makes you think it is possible to precisely define such a frontier? And why should such a thing, if it is possible at all, be 1. doable by humans and 2. doable with the amount of energy and computing power available to us within the coming couple of decades?