babel_ 4 days ago

AlphaZero may have had the rules built in, but MuZero and the other follow-ups didn't. MuZero not only matched or surpassed AlphaZero, but it did so with less training, especially in the EfficientZero variant, and notably also on the Atari benchmarks.

gavmor 4 days ago

This is "The Bitter Lesson" of AI, no? "More compute beats clever algorithm."

adastra22 3 days ago

> MuZero not only matched or surpassed AlphaZero, but it did so with less training

Seems the opposite?

babel_ 3 days ago

Quite the opposite: a clever algorithm needs less compute, and can leverage extra compute even more.

gavmor 3 days ago

Apologies, "clever" is a poor paraphrase of "domain-specific", or "methods that leveraged human understanding."[0]

0. http://www.incompleteideas.net/IncIdeas/BitterLesson.html

smokel 4 days ago

Thanks for pointing that out.

To be fair, MuZero only learns a model of the rules for navigating its search tree. To make actual moves, it gets a list of valid actions from the game engine, so at that level it does not learn the rules of the game.

(HRM possibly does the same, and could be in the same realm as MuZero. It probably makes a lot of illegal moves.)
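
To make the MuZero point concrete, here is a minimal sketch of what "gets a list of valid actions from the game engine" can look like at the root of the search. This is not MuZero's actual code: `masked_root_policy`, `select_move`, and the flat list-of-logits representation are hypothetical names and simplifications for illustration. The assumption is that the environment supplies `legal_actions`, while the learned model only produces scores over all actions.

```python
import math


def masked_root_policy(policy_logits, legal_actions):
    """Softmax restricted to the legal actions (hypothetical helper).

    policy_logits: one score per action, as produced by a learned model.
    legal_actions: action indices supplied by the game engine, not learned.
    Illegal actions receive probability 0.
    """
    legal = set(legal_actions)
    if not legal:
        raise ValueError("the environment reported no legal actions")
    # Subtract the max logit for numerical stability.
    max_logit = max(policy_logits[a] for a in legal)
    weights = {a: math.exp(policy_logits[a] - max_logit) for a in legal}
    total = sum(weights.values())
    return [
        weights[a] / total if a in legal else 0.0
        for a in range(len(policy_logits))
    ]


def select_move(visit_counts, legal_actions):
    """Play the most-visited action among those the engine allows."""
    return max(legal_actions, key=lambda a: visit_counts[a])


# The model likes action 3 best, but the engine says only 0 and 2 are legal.
logits = [2.0, 0.0, 1.0, 5.0]
probs = masked_root_policy(logits, legal_actions=[0, 2])
move = select_move([10, 50, 3, 99], legal_actions=[0, 2])
```

The mask is applied only where the real environment can be queried. Deeper in the search tree the planner has nothing but its learned model to go on, which is why the rules are "learned" for planning but not for choosing the move that is actually played.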