simianwords 10 hours ago

Challenge: provide a single example where an LLM can only provide the output and not the steps (in a text scenario).

latexr 9 hours ago

An LLM can always output steps, but that doesn’t mean the steps are true; LLMs are great at making up bullshit.

When the “how many ‘r’s in ‘strawberry’” question was all the rage, you could definitely get LLMs to explain the steps of counting, too. The answer was still wrong.
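
For reference, the ground truth is trivial to verify outside the model; a minimal Python sketch (the hardcoded word is purely illustrative):

  # Deterministically count the letter 'r' in 'strawberry'
  word = "strawberry"
  print(word.count("r"))  # prints 3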

simianwords 9 hours ago

Can you provide a single example right now where GPT 5.4 Thinking makes things up in its steps? Let’s try to reproduce it.

latexr 6 hours ago

I’m pretty sure you can think of one yourself; I’m not going to play this game. Now it’s 5.4 Thinking, before that it was 5.3, before that 5.2, 5.1, 5, before that it was 4… At every stage there’s someone saying “oh, the previous model doesn’t matter, the current one is where it’s at”. And when it’s shown that the current model can’t do something, there’s always some other excuse. It’s a profoundly bad-faith argument, the very definition of moving the goalposts.

I do have a number of examples to give you, but I no longer share them online so they aren’t caught and gamed. Now I share them strictly in person.

simianwords 6 hours ago

OK, so no example.