Remix.run Logo
BarryMilo 13 hours ago

I seem to remember that's one of the first things they tried, but the general models tended to win out. Turns out there's more to learn from all code/discussions than from just JS.

justinlivi 5 hours ago | parent [-]

From my own empirical research, the generalized models acting as specialists outperform both the tiny models acting as specialists and the generalist models acting as generalists. It seems that if peak performance is what you're after, then having a broad model act as several specialized models is the most impactful.