Superhuman Stratego with RL and test-time search (arxiv.org)
1 point by algo_trader 14 hours ago | 1 comment
algo_trader 14 hours ago

Only 2000 GPU hours.
Heavily customized network.
95% win rate in a recent human tournament sample.
Several training techniques for evaluation/learning rate.