Towards Autonomous Mathematics Research (arxiv.org)
41 points by gmays 2 hours ago | 8 comments
u1hcw9nx 29 minutes ago
>The results of this paper should not be interpreted as suggesting that AI can consistently solve research-level mathematics questions. In fact, our anecdotal experience is the opposite: success cases are rare, and an apt intuition for autonomous capabilities (and limitations) may currently be important for finding such cases. The papers (ACGKMP26; Feng26; LeeSeo26) grew out of spontaneous positive outcomes in a wider benchmarking effort on research-level problems; for most of these problems, no autonomous progress was made.
amiune 2 hours ago
Perfect match for this test: https://arxiv.org/abs/2602.05192
measurablefunc 2 hours ago
I still don't get how achieving 96% on some benchmark means it's a super genius but that last 4% is somehow still out of reach. The people who constantly compare robots to people should really ponder how a person who manages to achieve 90% on some advanced math benchmark still misses that last 10% somehow.