I believe they’re just classifying all models into “reasoning models” (e.g. o3) vs. “non-reasoning models” (e.g. 4o) and comparing total tokens (input tokens + hidden reasoning output tokens + shown output tokens).

maikakz 11 hours ago | parent [-]

that's exactly right!

	DIAexitNode 9 hours ago | parent [-]
		hell yeah, 109 out of 10 doors opened! 99 bonus doors! what are you talking about, man?