greenmilk 3 hours ago

Are any inference providers currently making a profit on inference? (I know Google makes money overall.)

wsun19 2 hours ago | parent | next [-]

Pretty much every major American inference provider claims to make a profit on API-based inference. Consumer plans might be subsidized overall, but it's hard to say, since they're a black box and some consumers don't fully use their plans.

henry2023 41 minutes ago | parent | prev | next [-]

Third parties selling open-weight inference on OpenRouter are surely selling at a profit. They have zero reason to subsidize it.

wavemode 2 hours ago | parent | prev | next [-]

Selling inference is not fundamentally different from selling compute: you amortize the lifetime cost of owning and operating the GPUs and turn that into a per-token price. The risk of loss is low demand (which would leave your facilities underutilized), but I doubt inference providers are suffering from that.
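The amortization math above can be sketched as a back-of-envelope calculation. Every number here is a hypothetical assumption for illustration, not a real provider figure:

```python
# Hypothetical back-of-envelope: amortize GPU ownership cost into a
# break-even per-token price. All inputs are illustrative assumptions.

def breakeven_price_per_million_tokens(
    gpu_cost_usd: float,             # purchase price of one GPU
    lifetime_years: float,           # assumed useful lifespan
    power_cost_usd_per_year: float,  # electricity + cooling per GPU per year
    tokens_per_second: float,        # sustained inference throughput
    utilization: float,              # fraction of time the GPU serves traffic
) -> float:
    seconds_per_year = 365 * 24 * 3600
    tokens_per_year = tokens_per_second * seconds_per_year * utilization
    cost_per_year = gpu_cost_usd / lifetime_years + power_cost_usd_per_year
    return cost_per_year / tokens_per_year * 1e6

# e.g. a $30k GPU, 4-year life, $3k/yr power, 2000 tok/s, 60% utilized
price = breakeven_price_per_million_tokens(30_000, 4, 3_000, 2_000, 0.6)
print(f"break-even: ${price:.3f} per million tokens")  # ~$0.277/M tokens
```

Any price per token above that break-even is gross margin; the underutilization risk shows up as a smaller `utilization`, which pushes the break-even price up.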

Where the long-term payoff still seems speculative is for companies doing training rather than just inference.

Gigachad 2 hours ago | parent [-]

There’s a lot of debate over the useful lifespan of the hardware, though. A number that seems very vibes-based determines whether these datacenters are a good investment or a disaster.

hypercube33 7 minutes ago | parent [-]

I specifically remember this debate coming up when the H100 was the only player on the table and AMD came out with a card that was almost as fast, at least in benchmarks, for about half the cost. I haven't seen a follow-up on real-world use, though. As a home labber, I know that in the last three weeks AMD support has gotten impressively useful, even covering CUDA if you enjoy pain and suffering.

What I'm curious about is the other stuff out there, such as ARM and tensor chips.

jagged-chisel 3 hours ago | parent | prev | next [-]

Google definitely makes money in other areas. Do they make money on inference?
