drob518 3 hours ago

As a publicity stunt, releasing a 300B open model is pretty smart. You can talk about its strong performance and it being “open” and “available,” but it’s so large that most people can’t use it themselves and might try out the cloud-based offering.

zozbot234 3 hours ago

The large models are actually MoE these days, so they're usable on ordinary hardware with weights streaming from SSD, just very slowly. You're nonetheless right that it makes the cloud-based offering more popular, since you can use that for convenience after testing a few inferences locally.
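
To put a rough number on "very slow", here is a back-of-envelope sketch. It assumes the worst case where every active expert weight that isn't already in RAM has to be read from SSD for each token, and it ignores compute time entirely. The function name and all the figures in the example (30B active parameters, 4-bit weights, 5 GB/s SSD) are hypothetical placeholders, not specs of any particular model or drive.

```python
def tokens_per_second(active_params, bytes_per_param, ssd_bytes_per_s,
                      resident_fraction=0.0):
    """Upper bound on decode speed when active weights stream from SSD.

    active_params:     parameters touched per token (MoE active set, assumed)
    bytes_per_param:   storage per parameter, e.g. 0.5 for 4-bit quantization
    ssd_bytes_per_s:   sustained SSD read bandwidth in bytes per second
    resident_fraction: share of active weights already cached in RAM (0..1)
    """
    if not 0.0 <= resident_fraction <= 1.0:
        raise ValueError("resident_fraction must be between 0 and 1")
    # Bytes that must come off the SSD for a single token.
    bytes_per_token = active_params * bytes_per_param * (1.0 - resident_fraction)
    if bytes_per_token == 0:
        return float("inf")  # nothing to stream; SSD is not the bottleneck
    return ssd_bytes_per_s / bytes_per_token


# Hypothetical example: 30B active params, 4-bit weights, 5 GB/s SSD.
rate = tokens_per_second(30e9, 0.5, 5e9)
print(f"{rate:.2f} tokens/s")
```

Under those made-up numbers you end up well below one token per second, which is fine for trying a few prompts and not much else. Caching the hot experts in RAM (a higher resident_fraction) helps, but only in proportion to how much fits.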