menaerus 5 hours ago

You would probably understand if you knew how LLMs are run in the first place but, as ignorant as you are (sorry), I have no interest in debating this with you anymore. I tried to give you a concrete clue, which you unfortunately chose to counter with non-facts.

og_kalu 5 hours ago | parent [-]

Touting the requests per day is pretty meaningless without per-query numbers, but sure, I'm the one that doesn't understand. What people with no incentive to subsidize are charging is about as factual as it gets, but sure, lol.

I've replied to you once, man. Feel free to disengage, but let's not act like this has been some ongoing debate. No need to make up stories.

menaerus 4 hours ago | parent [-]

Which is why I said that it can be roughly estimated. And it can be roughly estimated even without those numbers assuming a fleet of some size X and assuming the number of hours this fleet is utilized per day, for the whole year. Either way, you will end up with a hefty number. Do the math and you'll see that inference is far from being cheap.
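
A minimal sketch of that back-of-the-envelope estimate, for anyone who wants to plug in their own guesses. The function name and every input value here (fleet size, utilization hours, hourly cost) are hypothetical placeholders, not known figures for any provider:

```python
def annual_fleet_cost(num_gpus, hours_per_day, cost_per_gpu_hour, days_per_year=365):
    """Rough yearly cost of running a GPU fleet.

    All arguments are assumptions supplied by the caller; nothing here
    reflects real figures for any company.
    """
    return num_gpus * hours_per_day * days_per_year * cost_per_gpu_hour


if __name__ == "__main__":
    # Placeholder inputs: fleet of size X, utilized some hours per day, all year.
    total = annual_fleet_cost(num_gpus=100_000, hours_per_day=20, cost_per_gpu_hour=2.0)
    print(f"Estimated annual fleet cost: ${total:,.0f}")
```

The result scales linearly with each input, so the conclusion depends entirely on what you assume for X and for the hourly cost.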

og_kalu 4 hours ago | parent [-]

Of course the total across all requests is a hefty number; they're serving nearly a billion weekly active users. What else would you expect? Google Search, Facebook: those would all be hefty numbers. The point is that inference is pretty cheap per user, so when they get around to implementing ads, they'll be profitable.

Again, there are many indicators that inference per user is cheap. Even the sheer fact that OpenAI closed 2024 serving hundreds of millions of users and lost 'only' $5B is a pretty big clue that inference is not that expensive.
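
To make the per-user framing concrete, here is a rough sketch of the division. The $5B loss is the figure quoted above; the user count is a placeholder standing in for "hundreds of millions", and the function name is made up for illustration. Note that a net loss is not the same thing as inference cost (it is net of revenue and covers all other spending too), so treat this as a rough indicator only:

```python
def loss_per_user(annual_loss, active_users):
    """Annual net loss divided by active users (a rough indicator only)."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return annual_loss / active_users


if __name__ == "__main__":
    annual_loss = 5e9      # the 'only' $5B loss quoted above
    active_users = 500e6   # placeholder for "hundreds of millions" of users
    per_year = loss_per_user(annual_loss, active_users)
    print(f"Loss per user per year:  ${per_year:.2f}")
    print(f"Loss per user per month: ${per_year / 12:.2f}")
```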

menaerus 18 minutes ago | parent [-]

Nonsense, but I digress.