_diyar 6 days ago:
We expect this approach to work because it's currently the best-performing approach; nothing else comes close. Using a symbolic language is a good idea in theory, but in practice it doesn't scale as well as auto-regression + RL. DeepMind's IMO results illustrate this well: in 2024 they solved the problems with AlphaProof and AlphaGeometry, using the Lean language as a formal symbolic logic [1]. In 2025 they performed better and faster with just a fancy version of Gemini, using only natural language [2].

[1] https://deepmind.google/discover/blog/ai-solves-imo-problems...
[2] https://deepmind.google/discover/blog/advanced-version-of-ge...

Note: I agree with the parent comment's notion that letting the models reason in latent space might make sense, but that's where I'm out of my depth.
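For readers unfamiliar with what "formal symbolic logic" means here: in Lean, both the problem statement and its proof are machine-checkable objects. A toy Lean 4 example (my own, far simpler than any IMO problem) looks like this:

```lean
-- A trivially stated theorem: adding zero to any natural number
-- returns that number. `rfl` proves it because both sides reduce
-- to the same term by definition.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl
```

Systems like AlphaProof search for proof terms like this at vastly larger scale; the key contrast with natural-language reasoning is that the checker, not the model, decides whether the proof is valid.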
xg15 5 days ago:
The example is a bit of an unfair comparison, though. The benchmark takes questions already posed in natural language and, as far as I can tell, expects results in natural language too. This means that whatever system is evaluated on this challenge necessarily has to deal with natural language, and indeed a big part of the AlphaProof system was a neural network to convert from natural language to Lean. None of this has anything to do with reasoning ability.

I think it would be interesting to pose the inverse challenge, where the problems are already stated in a formal language. Would a network that first converts them into natural language, then does chain-of-thought on that, then translates the result back into formal language still beat a simple symbolic reasoner that operates on the formal language directly?
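The round-trip pipeline in that thought experiment can be sketched as below. Every function here is a hypothetical stand-in (none of these are real APIs); the point is only to make the two competing pipelines concrete.

```python
# Sketch of the "inverse challenge": problems arrive in a formal
# language. Pipeline A round-trips through natural language; pipeline
# B stays symbolic. All functions are illustrative stubs.

def formal_to_natural(stmt: str) -> str:
    # Stand-in for a model that verbalizes a formal statement.
    return f"Prove that {stmt}"

def chain_of_thought(question: str) -> str:
    # Stand-in for an LLM reasoning step-by-step in natural language.
    return f"reasoned answer to: {question}"

def natural_to_formal(answer: str) -> str:
    # Stand-in for autoformalization back into, e.g., Lean.
    return f"theorem result : {answer}"

def roundtrip_pipeline(formal_problem: str) -> str:
    """Pipeline A: formal -> natural -> CoT -> formal."""
    return natural_to_formal(chain_of_thought(formal_to_natural(formal_problem)))

def symbolic_reasoner(formal_problem: str) -> str:
    """Pipeline B: a prover operating on the formal language directly."""
    return f"theorem result : proof of {formal_problem}"

print(roundtrip_pipeline("n + 0 = n"))
print(symbolic_reasoner("n + 0 = n"))
```

The interesting empirical question is which pipeline wins once the stubs are replaced with real components, since the round trip through natural language adds two translation steps that can each introduce errors.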
safety1st 6 days ago:
Very interesting stuff, thanks!