Remix.run Logo
covi 3 hours ago

This feels like the chimpanzee with a power drill. An agent is honestly just brute-force search, but guided.

chaos_emergent 2 hours ago | parent | next [-]

Human-driven research is also brute-force but with a more efficient search strategy. One can think of a parameter that represents research-search-space-navigation efficiency. RL-trained agents will inevitably optimize for that parameter. I agree with your statement insomuch as the value of that efficiency parameter is lower for agents than humans today.

It's really hard to imagine that they __won't__ exceed the human value for that efficiency parameter rather soon given that 1. there are plenty of scalar value functions that can represent research efficiency, of which a subset will result in robust training, and 2. that AI labs have a massive incentive to increase their research efficiency overall, along with billions of dollars and really good human researchers working on the problem.

groby_b 2 hours ago | parent | prev | next [-]

Is there anything in the research space that doesn't fit "brute-force search, but guided"?

All of science is "gather inputs, make hypothesis, test, analyse" on repeat.

There's plenty to critique in the particular guidance approach, but the overall method is the same.

gwern an hour ago | parent | prev | next [-]

Except the power drill isn't being used to make a better chimpanzee.

2 hours ago | parent | prev [-]
[deleted]