▲ | noodletheworld 5 days ago
Huh. I feel oddly skeptical about this article. I can't specifically argue the numbers, since I have no idea, but... there are some decent open source models. They're not state of the art, but if inference is this cheap, why aren't there multiple API providers offering models at dirt cheap prices? The only cheap-ass providers I've seen run only tiny models. Where's my cheap DeepSeek-R1? Surely if it's this cheap, and we're talking massive margins according to this, I should be able to get a cheap / run my own 600B param model.

Am I missing something? It seems that reality (i.e. the absence of people actually doing things this cheaply) is the biggest critic of this set of calculations.
▲ | dragonwriter 5 days ago
> but if inference is this cheap then why aren't there multiple API providers offering models at dirt cheap prices

There are multiple API providers offering models at dirt cheap prices, enough so that there is at least one well-known API provider that is an aggregator of other API providers and offers lots of models at $0.

> The only cheap-ass providers I've seen only run tiny models. Where's my cheap deepseek-R1?
▲ | jsnell 5 days ago
> why aren't there multiple API providers offering models at dirt cheap prices?

There are. Basically every provider's R1 prices are cheaper than estimated by this article.
▲ | colinsane 5 days ago
> I should be able to get a cheap / run my own 600B param model.

If the margins on hosted inference are 80%, then you need >20% utilization of whatever you build for yourself for this to be less costly to you (on margin).

I self-host open-weight models (please: DeepSeek et al. aren't open _source_) on whatever $300 GPU I bought a few years ago, but if it outputs 2 tokens/sec then I'm waiting 10 minutes for most results. If I want results in 10s instead of 10m, I'll be paying $30,000 instead. And if I'm prompting it 100 times during the day, it's still idle ~99% of the time.

Coordinating a group buy for that $30,000 GPU and sharing it across 100 people probably makes more sense than either arrangement in the previous paragraph. For now, that's a big component of what model providers, uh, provide.
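A back-of-envelope sketch of that utilization math (every constant is an illustrative assumption: the GPU price, lifetime, batched throughput, and the provider's per-token price are all made up):

```python
# Self-hosting vs. paying a provider per token, as a function of utilization.
# Every constant here is an assumption for illustration, not a measured figure.

GPU_COST_USD = 30_000        # the hypothetical "fast" GPU from the comment
GPU_LIFETIME_YEARS = 3       # assumed amortization window
TOKENS_PER_SEC = 1600        # assumed aggregate throughput with batching
API_PRICE_PER_MTOK = 1.00    # assumed blended provider price, $/1M tokens

SECONDS_PER_YEAR = 365 * 24 * 3600
amortized_usd_per_sec = GPU_COST_USD / (GPU_LIFETIME_YEARS * SECONDS_PER_YEAR)

def self_host_usd_per_mtok(utilization: float) -> float:
    """$/1M tokens when the GPU is busy `utilization` fraction of the time."""
    return amortized_usd_per_sec / (TOKENS_PER_SEC * utilization) * 1_000_000

for util in (0.01, 0.05, 0.20, 1.00):
    print(f"{util:>4.0%} busy: ${self_host_usd_per_mtok(util):8.2f}/Mtok "
          f"(provider: ${API_PRICE_PER_MTOK:.2f}/Mtok)")
```

With these made-up numbers, break-even against the provider's price lands right around 20% utilization (matching the 80%-margin arithmetic above), and at 1% utilization the amortized hardware alone costs ~20x what the provider charges.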
▲ | brokencode 5 days ago
I also have no idea about the numbers. But I do know that these same companies are pouring many billions of dollars into training models, paying very expensive staff, and building out infrastructure. Those costs would need to be factored in to arrive at the actual profit margins.
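For instance, a toy sketch of how gross serving margin differs from margin after those costs (every dollar figure below is invented for illustration, not company data):

```python
# Gross margin on inference vs. margin after amortized training/R&D/staff.
# All figures are invented placeholders, not company data.

revenue_usd      = 4.0e9  # assumed annual API + subscription revenue
serving_cost_usd = 0.8e9  # assumed annual GPU/serving opex
training_rnd_usd = 3.0e9  # assumed amortized training + staff + infra

gross_margin = (revenue_usd - serving_cost_usd) / revenue_usd
net_margin = (revenue_usd - serving_cost_usd - training_rnd_usd) / revenue_usd

print(f"gross margin on serving:     {gross_margin:.0%}")  # 80%
print(f"margin after training & R&D: {net_margin:.0%}")    # 5%
```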
▲ | martinald 5 days ago
There are; I screenshotted DeepInfra in the article, but there are a lot more: https://openrouter.ai/deepseek/deepseek-r1-0528
▲ | hirako2000 5 days ago
IMO the article is totally off the mark, since it assumes users on average don't go over 1M tokens per day. AFAIK OpenAI doesn't enforce a daily quota even on the $20 plans unless the platform is under pressure. Since I often consume 20M tokens per day, one can assume many users consume far more than the 1M tokens assumed in the article's calculations.
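As a rough illustration of how sensitive the math is to that assumption (the $0.50/Mtok serving cost below is a placeholder, not a figure from the article):

```python
# Per-subscriber economics at different daily token volumes.
# The serving cost per 1M tokens is an assumed placeholder.

PLAN_PRICE_USD = 20.0      # monthly subscription price
COST_PER_MTOK_USD = 0.50   # assumed fully-loaded serving cost, $/1M tokens
DAYS_PER_MONTH = 30

for mtok_per_day in (1, 5, 20):
    monthly_cost = mtok_per_day * DAYS_PER_MONTH * COST_PER_MTOK_USD
    margin = PLAN_PRICE_USD - monthly_cost
    print(f"{mtok_per_day:>2}M tokens/day -> ${monthly_cost:6.2f}/mo to serve, "
          f"margin ${margin:+8.2f} on a ${PLAN_PRICE_USD:.0f} plan")
```

Under those assumptions a 1M-token/day user is profitable while a 20M-token/day user loses well over $100/month, so the assumed average drives the whole conclusion.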
▲ | GaggiX 5 days ago
https://openrouter.ai/deepseek/deepseek-chat-v3.1

They are dirt cheap. Same model architecture for the comparison: $0.30/M input, $1.00/M output. Or even $0.20/M–$0.80/M from another provider.
▲ | johnsmith1840 5 days ago
Another giant problem with this article is that we have no idea what optimizations they use on their end. These large AI companies use some wildly complex optimizations. What I'm trying to say is that hosting your own model is in an entirely different league from how the pros do it.

Even if errors in the article imply higher costs, I'd argue it comes back to profit anyway because of how advanced inference optimization has become. If actual model intelligence is not a moat (this is looking likely to be true), the real sauce of profitable AI companies is advanced optimization across the entire stack. OpenAI is NEVER going to release its specialized kernels, routing algos, quantizations, or model compilation methods. These are all really hard and really specific.
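As one concrete example of why that stack-level optimization matters, here is a rough sketch of what quantization alone does to the weight footprint of a 600B-param model (ignoring KV cache, activations, and MoE sparsity; the 80 GB accelerator size is just an assumption):

```python
# Weight-only memory footprint of a 600B-parameter model at several
# precisions -- one reason quantization is central to cheap inference.
# Ignores KV cache, activations, and MoE sparsity; purely illustrative.

PARAMS = 600e9
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}
GPU_MEM_GB = 80  # assumed per-accelerator memory

for fmt, nbytes in BYTES_PER_PARAM.items():
    weights_gb = PARAMS * nbytes / 1e9
    print(f"{fmt}: {weights_gb:6.0f} GB of weights "
          f"(~{weights_gb / GPU_MEM_GB:.0f} x {GPU_MEM_GB} GB GPUs just to hold them)")
```

Halving the bytes per weight roughly halves the number of accelerators needed just to hold the model, before any kernel or routing tricks even come into play.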
▲ | paulddraper 5 days ago
I would not be surprised if the operating costs are modest. But these companies also have very expensive R&D and large upfront costs.
▲ | jedberg 5 days ago
DeepSeek R1 for free.