| ▲ | menaerus 7 hours ago |
| There's one thing you're missing: inference is not cheap. HW is not cheap. Electricity is not cheap. And that's without R&D. They show that, on average, they recorded ~2.627B requests/day. That's ~79B requests/month or ~948B requests/year. And this is only the consumer ChatGPT data; Enterprise isn't included AFAICT. Each request translates to a direct cost that can be at least roughly estimated. |
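A minimal sketch of that kind of back-of-envelope estimate, assuming a purely illustrative per-request cost (the thread gives no actual per-request figure):

    # Rough annual inference spend implied by the request volume above.
    # The per-request cost is an assumed placeholder, not a reported number.
    requests_per_day = 2.627e9        # ~2.627B consumer ChatGPT requests/day, per the comment
    assumed_cost_per_request = 0.002  # USD, hypothetical

    daily_cost = requests_per_day * assumed_cost_per_request
    yearly_cost = daily_cost * 365
    print(f"Daily:  ${daily_cost / 1e6:,.1f}M")   # ~$5.3M/day with these inputs
    print(f"Yearly: ${yearly_cost / 1e9:,.2f}B")  # ~$1.9B/year with these inputs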
|
| ▲ | og_kalu 4 hours ago | parent [-] |
| No, inference is pretty cheap, and a lot of things point to that being true: the prices third-party providers charge for API access to open models (they have no motive to subsidize inference), and Google saying their median query is about as expensive as a Google search. Thing is, what you're saying would have been true a few years ago; this would all have been intractable. But LLM inference costs have quite literally been slashed by several orders of magnitude in the last couple of years. |
| |
| ▲ | menaerus 3 hours ago | parent [-] |
| You would probably understand if you knew how LLMs are run in the first place, but, as ignorant as you are (sorry), I have no interest in debating this with you anymore. I tried to give you a tractable clue, which you unfortunately chose to counter with non-facts. |
| ▲ | og_kalu 3 hours ago | parent [-] |
| Touting requests per day is pretty meaningless without per-query numbers, but sure, I'm the one who doesn't understand. What providers with no incentive to subsidize are charging is about as factual as it gets, but sure, lol. I've replied to you once, man. Feel free to disengage, but let's not act like this has been some ongoing debate. No need to make up stories. |
| ▲ | menaerus 2 hours ago | parent [-] |
| Which is why I said that it can be roughly estimated. And it can be roughly estimated even without those numbers: assume a fleet of some size X and a number of hours that fleet is utilized per day, for the whole year. Either way, you will end up with a hefty number. Do the math and you'll see that inference is far from cheap. |
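A sketch of the fleet-style estimate described above, where the fleet size, blended hourly cost, and utilization are all assumed placeholder values:

    # Fleet-based estimate: a fleet of some size X, at an assumed blended
    # $/GPU-hour (hardware + electricity), utilized some hours per day, all year.
    # Every input below is a hypothetical placeholder for illustration.
    fleet_size = 300_000        # GPUs, assumed
    cost_per_gpu_hour = 2.50    # USD/hour, assumed blended rate
    hours_per_day = 24          # assume round-the-clock utilization

    yearly_cost = fleet_size * cost_per_gpu_hour * hours_per_day * 365
    print(f"Yearly fleet cost: ${yearly_cost / 1e9:,.1f}B")  # ~$6.6B with these inputs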
| ▲ | og_kalu 2 hours ago | parent [-] |
| Of course the total over all those requests is a hefty number; they're serving nearly a billion weekly active users. What else would you expect? Google Search, Facebook: those would all be hefty numbers too. The point is that inference is pretty cheap per user, so when they get around to implementing ads, they'll be profitable. Again, there are many indicators that per-user inference is cheap. Even the sheer fact that OpenAI closed 2024 serving hundreds of millions of users and lost 'only' ~$5B is a pretty big clue that inference is not that expensive. |
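A per-user version of the same rough math, using only the round figures cited in the comment above (both inputs are approximations, not audited numbers):

    # Implied loss per user: a ~$5B 2024 loss spread over "hundreds of millions" of users.
    annual_loss = 5e9       # USD, the "lost 'only' 5B" figure from the comment
    assumed_users = 400e6   # assumed stand-in for "hundreds of millions"

    loss_per_user = annual_loss / assumed_users
    print(f"Implied loss per user per year: ${loss_per_user:.2f}")  # ~$12.50 with these inputs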