GPT-5.6 cheats so much its testers couldn't measure it

Why are the outputs measured in hours? Shouldn't it be tokens, or even words, since tokenizers can be more or less efficient?

	throwitaway222 6 hours ago | parent
		Especially since TPS (tokens per second) on 5.6 might be much faster.

Sam Altman promised us AGI, but OpenAI accidentally built something more human: an AI that cheats on exams just to look smarter than Claude.