Tokens can be sold at a profit, but 70% of compute expenditure goes to R&D and model training[0]. Inference revenue has to cover all of that on top of being profitable in a vacuum.
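
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the 70% figure from [0], that the remaining 30% of compute spend is all inference serving, that inference is the only revenue source, and that non-compute costs (salaries, data, and so on) are ignored. The function names are made up for illustration.

```python
# Share of compute spend going to R&D and training, as cited in [0].
RND_TRAINING_SHARE = 0.70


def breakeven_markup(rnd_share: float) -> float:
    """Revenue needed per dollar of inference compute to cover ALL compute spend.

    Assumption: compute spend splits into R&D/training (rnd_share) and
    inference (1 - rnd_share), and inference revenue is the only income.
    """
    if not 0.0 <= rnd_share < 1.0:
        raise ValueError("rnd_share must be in [0, 1)")
    return 1.0 / (1.0 - rnd_share)


def breakeven_gross_margin(rnd_share: float) -> float:
    """Gross margin on inference at which total compute spend is just covered."""
    return 1.0 - 1.0 / breakeven_markup(rnd_share)


if __name__ == "__main__":
    markup = breakeven_markup(RND_TRAINING_SHARE)
    margin = breakeven_gross_margin(RND_TRAINING_SHARE)
    print(f"Break-even revenue per $1 of inference compute: ${markup:.2f}")
    print(f"Break-even gross margin on inference: {margin:.0%}")
```

Under those assumptions, the break-even gross margin on inference simply equals the R&D/training share, so "profitable in a vacuum" is a much lower bar than covering total compute.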

[0] https://epoch.ai/data-insights/openai-compute-spend

ml_basics 4 hours ago

This will change as inference demand increases (which is happening right now, faster than many people expected).

	ainch 34 minutes ago
		At the same time, the training paradigm being scaled, Reinforcement Learning, is significantly less data-efficient than next-token prediction. You basically need to run an agent for minutes (or longer if you want good long-horizon performance), only to give it a binary pass/fail: one bit of information. Inference compute is definitely scaling fast, but to scale RL, training and R&D compute also need to scale hard. I don't think it's obvious that inference will overtake R&D/training, unless there's a reputable source that states that.
	vb-8448 3 hours ago
		Do you have a reference for that?
