dylan604 | 10 days ago
One of these things is not like the others. $8.50/1000?? Any chance that's a typo? Otherwise, for someone who has no experience with LLM pricing models: why is Llama 90b so expensive?
int_19h | 10 days ago
It's not uncommon to see outliers like this when using brokers. What happens, basically, is that some models are very popular and have many different providers, so they are priced "close to the metal": the routing will normally pick the cheapest option that meets the specified requirements (such as context size). But other models - typically more specialized ones - are only hosted by a single provider, and that provider can then price them well above raw compute cost. E.g. if you look at https://openrouter.ai/models?order=pricing-high-to-low, you'll see that some 7B and 8B models are more expensive than Claude Sonnet 3.7.
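The routing behavior described above can be sketched roughly as "filter by requirements, then take the cheapest offer." This is a minimal illustration, not OpenRouter's actual algorithm; the provider names and prices are made up:

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str
    context_window: int    # max context size in tokens
    usd_per_mtok: float    # price in USD per million tokens

# Hypothetical offers for a single model from several providers.
offers = [
    Offer("provider-a", 8_192, 0.20),
    Offer("provider-b", 131_072, 0.90),
    Offer("provider-c", 32_768, 0.35),
]

def route(offers, min_context):
    """Pick the cheapest offer that satisfies the context requirement."""
    eligible = [o for o in offers if o.context_window >= min_context]
    if not eligible:
        raise ValueError("no provider meets the requirements")
    return min(eligible, key=lambda o: o.usd_per_mtok)

# With a 16K context requirement, provider-a is excluded and
# provider-c wins on price; a model with only one eligible
# provider gets whatever price that provider sets.
print(route(offers, 16_000).provider)
```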
themanmaran | 10 days ago
That was the cost when we ran Llama 90b through TogetherAI. But it's quite hard to standardize, since it depends a lot on who is hosting the model (e.g. Together, OpenRouter, Groq). I think in order to run a proper cost comparison, we would need to run each model on an AWS GPU instance and compare the runtime required.
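The normalization being proposed boils down to converting instance rental cost and measured throughput into a per-token figure. A small sketch, with purely illustrative numbers (the $12/hr rate and 400 tokens/s throughput are assumptions, not benchmarks):

```python
def cost_per_1k_tokens(instance_usd_per_hour, tokens_per_second):
    """Convert GPU-instance rental cost plus measured throughput
    into a cost per 1,000 generated tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return instance_usd_per_hour / tokens_per_hour * 1000

# Illustrative: a $12/hr instance sustaining 400 tokens/s
# works out to roughly $0.0083 per 1K tokens.
print(round(cost_per_1k_tokens(12.0, 400), 4))
```

Running each model on the same instance and plugging in its measured throughput would give directly comparable numbers, independent of any hosting provider's markup.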