▲ mnicky 6 hours ago
What I haven't seen discussed anywhere so far is how big a lead Anthropic seems to have in intelligence per output token, e.g. if you look at [1]. We already know that intelligence scales with the log of the number of tokens used for reasoning, but Anthropic seems to have much more powerful non-reasoning models than its competitors. I read somewhere that they have a policy of not advancing capabilities too much, so could it be that they are sandbagging, releasing models with artificially capped reasoning so that they sit at a level similar to their competitors'? How do you read this?
▲ phamilton 6 hours ago | parent [-]
Intelligence per token doesn't seem quite right to me. Intelligence per <consumable> feels closer. Per dollar, or per second, or per watt.
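The choice of denominator matters more than it might look: a minimal sketch with entirely made-up numbers (no real model or pricing data) showing how the ranking of two models can flip depending on whether you normalize a benchmark score per output token or per dollar:

```python
# Hypothetical illustration: all scores, token counts, and prices below are
# invented for the example, not real model data.

def score_per_token(score, output_tokens):
    """Benchmark score per output token consumed."""
    return score / output_tokens

def score_per_dollar(score, output_tokens, price_per_million_tokens):
    """Benchmark score per dollar spent on output tokens."""
    cost = output_tokens / 1_000_000 * price_per_million_tokens
    return score / cost

# Model A: terse non-reasoning model, expensive per token.
a_score, a_tokens, a_price = 80, 200_000, 15.0
# Model B: verbose reasoning model, cheap per token.
b_score, b_tokens, b_price = 85, 2_000_000, 1.0

# Per token, A wins (4.0e-4 vs 4.25e-5)...
print(score_per_token(a_score, a_tokens) > score_per_token(b_score, b_tokens))  # True
# ...but per dollar, B wins (~26.7 vs 42.5), since its tokens are 15x cheaper.
print(score_per_dollar(a_score, a_tokens, a_price) < score_per_dollar(b_score, b_tokens, b_price))  # True
```

So "intelligence per token" flatters terse models, while per-dollar or per-second normalization can favor a cheaper, chattier model entirely.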