| data_maan 2 hours ago | |
> these are problems of some practical interest, not just performative/competitive maths. FrontierMath did this a year ago. Where is the novelty here? > a solution is known, but is guaranteed to not be in the training set for any AI. Wrong, as the questions were posed to commercial AI models and they can solve them. This paper violates basic benchmarking principles. | ||
| offnominal 30 minutes ago | |
> Wrong, as the questions were posed to commercial AI models and they can solve them. Why does this matter? As far as I can tell, because the solution is not known, this only affects the time constant (i.e. the problems were known for longer than a week). It doesn't seem that I should care about that. | ||