Remix clone Hacker News


	▲	littlestymaar 6 hours ago
		I guess it mostly comes from running the model at batch size 1 locally versus a high batch size in a data center, since GPU power consumption doesn't grow that much with batch size. Note that while a local chatbot user will mostly be at batch size 1, that won't hold if they are running an agentic framework, so the gap is going to narrow or even reverse.
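A rough sketch of the arithmetic behind that claim, assuming energy per token is simply power draw divided by total token throughput. Every number below (power draw, tokens per second, batch sizes) is made up for illustration and not measured on any real hardware; `energy_per_token_j` is a hypothetical helper.

```python
def energy_per_token_j(power_w, tokens_per_s_per_seq, batch_size):
    """Joules per generated token: power divided by total throughput.

    Assumes per-sequence decode speed stays constant as the batch grows,
    which is only roughly true while decoding is memory-bandwidth bound.
    """
    total_tokens_per_s = tokens_per_s_per_seq * batch_size
    return power_w / total_tokens_per_s


# Hypothetical figures, for illustration only.
local = energy_per_token_j(power_w=300.0, tokens_per_s_per_seq=30.0, batch_size=1)
datacenter = energy_per_token_j(power_w=600.0, tokens_per_s_per_seq=30.0, batch_size=32)

print(f"local:      {local:.3f} J/token")
print(f"datacenter: {datacenter:.3f} J/token")
print(f"ratio:      {local / datacenter:.1f}x")
```

With these assumed inputs the batched GPU draws twice the power but serves 32 sequences at once, so its energy per token comes out much lower. Raising `batch_size` on the local side, as an agentic framework with parallel requests might, shrinks the gap.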
	▲	eru an hour ago
		Well, different parts of the world also have different electricity prices.