vannevar 14 hours ago
Two issues with this. One, it's profitable only on the assumption that you keep serving the same model forever, which is not realistic in this market. A given model has a shelf life, which these days is measured in months, not years, so trying to separate the cost of training the model from the cost of serving it doesn't make much business sense. Two, for providers that offer inference only via open-weight models, margins quickly get competed down as the service commoditizes. The "someday" when frontier model providers can enjoy their current high inference margins without the burden of significant training costs is never going to arrive.
skybrian 13 hours ago
Commoditization means there's price competition. From a consumer perspective, that's good: you want it to be a low-margin, high-volume, competitive business. From a business perspective, though, it can end up being ruinous competition, as with solar panels or airlines. A stable equilibrium with prices neither too low nor too high isn't guaranteed; it depends on market structure. It's anyone's guess whether this reaches an equilibrium or not, but I still expect that there will be companies like OpenRouter and Fireworks that offer inference at reasonable prices.
credit_guy 8 hours ago
> A given model has a shelf-life, which these days is measured in months, not years.

Not all new models are trained from scratch. ChatGPT 5.3 to 5.4 (and likely 5.5) was basically the same model, probably trained a bit more, not a new model from scratch.

> The "someday" when frontier model providers can enjoy their current high inference margins without the burden of significant training costs is never going to arrive.

That is debatable. I believe the moat for the frontier model providers is compute. At the level of 10 trillion parameters (which Fable/Mythos are rumored to have), you need serious compute to serve inference, and you also need serious compute to train. Will DeepSeek, Qwen, Kimi, or GLM come up with a new 10T model anytime soon? I doubt it. People keep saying that the Chinese labs are catching up to the US big 3 and that the gap is now only 4-6 months. I doubt a Chinese version of Fable/Mythos will be released in the next 12 months.