Superhuman Stratego with RL and test-time search (arxiv.org)
1 point by algo_trader 14 hours ago | 1 comment
algo_trader 14 hours ago
Only 2000 GPU hours
Heavily customized network
95% win rate in recent human tournament sample
Several training techniques for evaluation/learning rate