djoldman 3 days ago

From Table 3, it appears that DeepSeek R1 has the highest eval scores.

It's a 671B model vs. 405B, so obviously "larger".
