Remix clone Hacker News


	▲	drakythe a day ago
		Anthropic also recently tweaked their usage limits to discourage use during peak hours. Why would they do that if inference was profitable?
	▲	infecto a day ago
		Don’t confuse inference (API usage) with the consumer plan products. When people say inference is profitable, they are referring to the cost to serve a token via the API. The consumer products are absolutely a question mark on profitability and, as we see with most of the business and enterprise plans, they are going away in favor of pure on-demand use (API pricing) full time.
	▲	strangegecko a day ago
		Profitability doesn't imply infinite ability to scale. Of course they will want to prioritize their most profitable customers when they hit capacity issues.
	▲	aurareturn a day ago
		They do it because their demand is higher than the compute they have available. Their GPUs must be melting during peak hours, so they're encouraging people to move their workloads to off-peak hours if possible. This is the opposite of an AI bubble burst.
	▲	paulddraper a day ago
		Those are subscription plans. They tweaked the limits/periods included in the subscription. Having higher limits for subscription plans didn't give them any more revenue.
	▲	financltravsty a day ago
		Their infra team is very understaffed and they are reacting to the public backlash of "no 9s?"