onlyrealcuzzo 14 hours ago

All of the methods you described rely on deterministic paths.

GRAM is unique AFAIK in that it's exploring probabilistic paths.

AFAIK, the deterministic path exploration was nowhere near as impressive as GRAM in terms of reasoning benefits.

GRAM is reasoning better than models 2,000-10,000x its size. The deterministic approaches only gave 2x-10x improvements.

Naively, GRAM seems to be applying to LLMs what LeCun wants to do with JEPA and World Models.