ralusek 2 hours ago

We used to have no idea, but now that open source models are hosted and served by providers whose only incentive is turning a profit directly on running inference, we have a ballpark figure.

sarchertech an hour ago | parent | next [-]

No, we have no idea. The open source inference market could be kept artificially cheap by operators running at a loss in hopes of gaining market share. It only takes a few, and everyone else has to cut prices to compete while hoping their own costs come down and the subsidized pricing dries up.

We would also have to assume these operators are pricing GPU depreciation correctly, and the market is so new there is no reason to believe they are.
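Back of the envelope, the depreciation assumption alone can swing per-token cost severalfold. A minimal sketch, where every number (GPU price, useful life, utilization, throughput) is an illustrative assumption rather than a measurement of any real provider:

```python
# Sketch: how GPU depreciation assumptions swing amortized per-token cost.
# All figures are hypothetical, not data from any real operator.

def cost_per_million_tokens(gpu_price_usd, depreciation_years,
                            utilization, tokens_per_second):
    """Amortized hardware cost per 1M output tokens for one GPU."""
    useful_seconds = depreciation_years * 365 * 24 * 3600 * utilization
    usd_per_second = gpu_price_usd / useful_seconds
    return usd_per_second / tokens_per_second * 1_000_000

# Same hypothetical GPU and throughput, two depreciation schedules:
optimistic = cost_per_million_tokens(30_000, 5, 0.7, 1000)   # 5-year life, 70% busy
pessimistic = cost_per_million_tokens(30_000, 2, 0.5, 1000)  # 2-year life, 50% busy

print(f"5-year depreciation: ${optimistic:.2f} / 1M tokens")
print(f"2-year depreciation: ${pessimistic:.2f} / 1M tokens")
```

Under these made-up inputs the pessimistic schedule is roughly 3.5x the optimistic one, so a provider that underestimates depreciation can quote prices well below true cost without knowing it.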

alex_sf 2 hours ago | parent | prev [-]

There's no reason to think that the latest frontier models have similar inference costs to open source models.

It would be more surprising if the surrounding architecture hadn't significantly diverged. And if it _hasn't_ diverged, then the performance gap would imply that the frontier models have significantly larger parameter counts, which would mean higher inference cost.
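The param-count point follows from the dense-transformer rule of thumb that a forward pass costs roughly 2 FLOPs per parameter per generated token, so per-token compute scales about linearly with size. A sketch under that assumption, with made-up model sizes (and note MoE or other architectural divergence breaks the linearity, which is the parent's first point):

```python
# Rule-of-thumb sketch: dense decoder inference costs ~2 FLOPs per
# parameter per generated token, so per-token compute (and, at fixed
# hardware efficiency, cost) scales ~linearly with parameter count.
# Model sizes below are hypothetical.

def relative_inference_cost(params_a, params_b):
    """Ratio of per-token compute between two dense models."""
    flops_per_token = lambda p: 2 * p  # ~2 FLOPs/param/token (dense)
    return flops_per_token(params_a) / flops_per_token(params_b)

# Hypothetical 700B frontier model vs a hypothetical 70B open model:
print(relative_inference_cost(700e9, 70e9))  # prints 10.0
```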