adastra22 4 days ago

There are plenty of ways to write more capable software stacks using LLMs that don’t rely on reinforcement learning. If anything, the AI labs focus too much on building larger models with bigger or better sets of labeled data, when algorithmic changes would let you do more with the same tools.

tibbar 4 days ago

To slightly amend my original take, much of this data is specifically for evals. That is, the labs need some sort of data set and grading scheme by which they can measure and direct their progress.