Remix.run Logo
alexlitz 3 hours ago

I imagine getting things to be polysemantic in a way that does not interfere would lead to sublinear scaling. Also there are smaller ones that were trained so would still be more like 311/36 ~= 8.6.

Lerc 2 hours ago | parent | next [-]

>I imagine getting things to be polysemantic in a way that does not interfere would lead to sublinear scaling.

True, but with even smarter humans, you could exploit the interactions for additional calculations.

While it sounds a bit silly, it is one of the hypotheses behind a fast takeoff. An AI that is sufficiently smart could design a network better than a trained one and could make something much smarter than itself on the same hardware. The question then becomes if that new smarter one can do an even better job. I suspect diminishing returns, but then again I am insufficiently smart.

sowbug 3 hours ago | parent | prev [-]

Thanks!

(I see the Trained Weights results now, thanks.)