Remix.run Logo
hackinthebochs 4 hours ago

>AlphaGo didn't teach itself that move. The verifier taught AlphaGo that move.

No. AlphaGo developed a heuristic by playing itself repeatedly, the heuristic then noticed the quality of that move in the moment.

Heuristics are the core of intelligence in terms of discovering novelty, but this is accessible to LLMs in principle.