| ▲ | artninja1988 2 hours ago | |||||||
I think Sakana's papers are among the more creative ones, not just gunning for incremental benchmark improvements. But yeah, I agree that they can be a bit (or very) hypey. Regardless, I'd rather see more of their kind of research than endless benchmark chasing. All the best to David Ha and the team! | ||||||||
| ▲ | laughingcurve an hour ago | |||||||
Can you explain how 'recursive self-improvement' functions without 'endless benchmark chasing'? I mean, RSI is literally that. What do you think they're improving on? How would a model self-improve without some metric or data to check against? Once you have metrics plus data, that is a benchmark. And yes, simulations and/or soft verification like LLM judges are still a kind of benchmarking. Maybe it's not a static benchmark they can easily hack. Folks, RSI does not mean the self-improvement is them going to therapy and seeking inner peace to overcome trauma. | ||||||||