Animats | 7 days ago
> Even with all this infra buildout all the hyperscalers are constantly capacity constrained, especially for GPUs.

Are they constrained on resources for training, or resources for serving users with pre-trained LLMs? The first use case is R&D, the second is revenue. The ratio of hardware costs for those two areas would be good to know.
keeda | 7 days ago | parent
Good question. I don't believe they break out their workloads into training versus inference; in fact, they don't break out any numbers in useful detail. Anecdotally, though, the public clouds did seem to be most GPU-constrained whenever Sam Altman was making the rounds asking for trillions in infrastructure for training. However, my understanding is that the same GPUs can be used for both training and inference (potentially in different configurations?), so there is a lot of elasticity there.

That said, for the public clouds like Azure, AWS and GCP, training is also a source of revenue, because other labs pay them to train their models. This is where accusations of funny-money shell games come into play, since these companies often themselves invest in those labs.
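To make the elasticity point concrete, here is a minimal PyTorch sketch (the model and sizes are placeholders I made up, not anything a hyperscaler actually runs) showing the same GPU flipping between a gradient-bearing training step and a no-grad inference pass. The practical difference between the two "configurations" is mainly that training also keeps gradients and optimizer state resident on the device, while inference holds only weights and activations:

    # Sketch only: same device, two workloads. Assumes PyTorch is installed;
    # falls back to CPU if no GPU is available.
    import torch
    import torch.nn as nn

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = nn.Linear(1024, 1024).to(device)           # stand-in for a real model
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # Training configuration: forward + backward, gradients and optimizer
    # state occupy additional GPU memory.
    model.train()
    x = torch.randn(32, 1024, device=device)
    loss = model(x).pow(2).mean()
    loss.backward()
    opt.step()
    opt.zero_grad()

    # Inference configuration: gradients disabled, so the same hardware can
    # be redeployed to serve requests with lower per-request memory cost.
    model.eval()
    with torch.no_grad():
        y = model(torch.randn(32, 1024, device=device))
    print(loss.item(), y.shape)

In practice the switch isn't this trivial at fleet scale (different parallelism setups, interconnect requirements, batching for serving, etc.), but the underlying silicon is the same, which is what lets providers shift capacity between the two.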