Remix clone Hacker News

	▲	kcb 10 hours ago
		Inference isn't really that expensive; it's the training of new foundation models that is. With whatever highly optimized setup the big providers are using, they should be able to pack quite a lot of concurrent users onto a single deployment of a model. Keep in mind, too, that their use case might well be served just fine by a 100B model deployed to a $4,000 DGX Spark.
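
For a rough sanity check on the 100B claim, here is a back-of-the-envelope sketch of weight memory at different precisions. It assumes the DGX Spark has 128 GB of unified memory and counts weights only (no KV cache, activations, or runtime overhead); the function name and constant are made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Assumption: 128 GB of unified memory on the box.
SPARK_MEMORY_GB = 128

for bits in (16, 8, 4):
    gb = weight_memory_gb(100, bits)
    # "fits" here only means the weights alone are under the memory budget.
    print(f"{bits}-bit weights: {gb:.0f} GB, fits: {gb < SPARK_MEMORY_GB}")
```

Under those assumptions a 100B model only fits once it is quantized to 8 bits or lower, and 4-bit leaves far more headroom for the KV cache and concurrent requests.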