babelfish 2 days ago

Wow, 30B parameters as capable as a 1T parameter model?

mhitza 2 days ago

On the benchmarks compared above, it is closer to other, larger open-weights models, and on par with GPT-OSS 120B, for which I also have a frame of reference.