lucasay 15 hours ago

This feels less like automated research and more like structured trial and error with a decent feedback loop. Still useful, but I think the real bottleneck is how good your eval metric is. If that’s weak, the whole loop just optimizes for the wrong thing faster.

Almondsetat 13 hours ago

Designing a good fitness function, a tale as old as time...

kridsdale1 15 hours ago

I mean, isn’t that “the scientific method”?

lucasay 13 hours ago

Partially—but science also questions the hypothesis and the metric. This mostly assumes both are correct and just optimizes within that box.