ej88 | 3 days ago
The article just isn't that coherent for me.

> when a new model is released as the SOTA, 99% of the demand immediately shifts over to it

99% is in the wrong ballpark. Lots of users use Sonnet 4 over Opus 4, despite Opus being 'more' SOTA. Lots of users use 4o over o3, or Gemini over Claude. In fact, it's never been a closer race for who is the 'best': https://openrouter.ai/rankings

> switch from opus ($75/m tokens) to sonnet ($15/m) when things get heavy. optimize with haiku for reading. like aws autoscaling, but for brains. they almost certainly built this behavior directly into the model weights

??? (See the routing sketch below.)

Overall, the article seems to argue that companies are running into issues with usage-based pricing because consumers don't accept or aren't used to it, and that it's hard to be the first to crack and switch to usage-based pricing. I don't think it's as big an issue as the author makes it out to be. We've seen this play out before in cloud hosting:

- Lots of consumers are OK with a flat monthly fee and an inferior model. 4o is objectively inferior to o3, but millions of people use it (or don't know any better). The free ChatGPT is even worse than 4o, and the vast majority of ChatGPT visitors use it!

- Heavy users and businesses consume via API with usage-based pricing (see cloud). This is almost certainly profitable.

- Fundamentally, most of these startups are B2B, not B2C.
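To be clear about what "autoscaling, but for brains" could mean: that kind of tier-switching doesn't have to live in the model weights at all; it can be a few lines of routing code on the application side. A minimal sketch, assuming only the Opus/Sonnet prices from the quote; the Haiku price, the $5 spend threshold, and the pick_model helper are made up for illustration:

    # Hypothetical application-level router, sketching the "autoscaling for brains"
    # behavior quoted above. Per-MTok output prices from the quote are $75 (Opus)
    # and $15 (Sonnet); the Haiku price, threshold, and function name are assumed.

    def pick_model(session_spend_usd: float, task: str) -> str:
        """Pick the cheapest model tier that fits, like choosing an instance size."""
        if task == "read":             # "optimize with haiku for reading"
            return "claude-3-haiku"
        if session_spend_usd > 5.0:    # arbitrary stand-in for "when things get heavy"
            return "claude-sonnet-4"   # drop from $75/MTok Opus to $15/MTok Sonnet
        return "claude-opus-4"         # default to the strongest (priciest) model

    # Example: a session that has already burned $7 gets downgraded to Sonnet,
    # while cheap read-only work always goes to Haiku.
    print(pick_model(session_spend_usd=7.0, task="code"))  # claude-sonnet-4
    print(pick_model(session_spend_usd=0.2, task="read"))  # claude-3-haiku

A real router would presumably key off rate limits, context length, or task type rather than a single spend threshold, but the point is that nothing here requires touching the weights.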
margalabargala | 3 days ago
> Lots of users use 4o over o3

How much of that is the naming? Personally, I just avoid OpenAI's models entirely because I have absolutely no way of telling how their products stack up against one another or which to use for what. In what world does o3 sort higher than 4o? If I have to research your products by name to figure out what to use for something that is already a commodity, you've already lost and are ruled out.
| ||||||||||||||||||||||||||||||||||||||
motorest | 3 days ago
> In fact it's never been a closer race on who is the 'best'

Thank you for pointing that out. Sometimes it's very hard to keep perspective.

I sometimes use Mistral as my main LLM. I know it's not lauded as the top-performing LLM, but the truth of the matter is that its results are just as useful as what the best ChatGPT/Gemini/Claude models output, and it is way faster. There are indeed diminishing returns on the current crop of commercial LLMs.

DeepSeek already proved that cost can be a major factor and quality can even improve. I think we're very close to seeing competition based on price, which might be the reason there is so much talk about mixture-of-experts approaches and how specialized models can drive down cost while improving targeted output.
| ||||||||||||||||||||||||||||||||||||||