Remix.run Logo
hugh-avherald 3 hours ago

I suspect it's reached the point where the distinguishing quality of one model over the others is only observable by true experts -- and only in their respective fields. We are exhausting the well of frontier questions that can be programmatically asked and the answers checked.

hodgehog11 3 hours ago | parent [-]

Absolutely this. Strong disagree that progress is plateauing, merely that gains are harder for the general public to perceive and typically come from more advanced means than simply scaling. Math performance in particular is improving at an uncomfortably rapid pace.