| ▲ | fooker 6 hours ago | |||||||
They say hill climbing: https://microsoft.ai/news/building-a-hillclimbing-machine-la... Unless they explicitly state that the training and test benchmarks are completely separate, we have to assume they test on the same 'hill' the model climbs. | ||||||||
| ▲ | artemisart 5 hours ago | parent | next [-] | |||||||
Hill climbing doesn't mean much on its own, and it absolutely doesn't imply they cheat on benchmarks. They have more details here: https://microsoft.ai/news/introducing-mai-thinking-1/ (it seems to be "RL on everything"). | ||||||||