throwaway314155 3 days ago
I don't think training on code and training on chess are even remotely comparable in terms of available data and the linguistic competency required. Coding (in the general case, which is what these models try to approach) is clearly the harder task and comes with _massive_ amounts of diverse data. Having said all of that, it wouldn't surprise me if the "language to world model" thesis you reference is indeed wrong. But I don't think a model that plays chess well disproves it, particularly since there are chess engines using old-fashioned approaches that utterly destroy LLMs.