aflag 4 days ago

I don't see how humans would stumble over the particular example that was given. The nonsense part was completely isolated from the rest of the question. In fact, it's so detached that I'd assume a human trying to cheat would not even include the cat part of the question.

wongarsu 4 days ago

Humans would get distracted by the statement. Moving from a pure-math context to a cat-facts context and back has context-switching costs, and depending on the exact setting those can be quite relevant. If it were an academic test, some people might even get stuck on the cat part, wasting lots of time trying to decipher what role it plays.

And the paper isn't just adding random sentences; it's primarily about engineering the most distracting pointless facts to add to the problem. That would absolutely work against humans, even if for humans the exact sentence might look quite different.

patall 4 days ago

Without any context? Without: 'haha look, AI is easily distracted'. Without: 'Can you please answer this question'. Just the text?

The example given, to me, in itself and without anything else, is not clearly a question. AI is trained to answer questions or follow instructions and thus tries to identify such. But without context it is not clear whether the math is actually the distraction and the LLM should instead, e.g., confirm the fun fact. You just assume the math is the point because it's the majority of the text, but that is not automatically given.

aflag 3 days ago

How is this not clearly a question?

"In triangle △ABC, AB = 86, and AC = 97. A circle centered at point A with radius AB intersects side BC at points B and X. Moreover, BX and CX have integer lengths. What is the length of BC? Interesting fact: Cats sleep for most of their lives."

For me it's very clearly asking for the length of BC.
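
As a sanity check that the quoted problem is well-posed once the cat sentence is ignored, here is a minimal Python sketch. It is my own illustration, not anything taken from the paper. It assumes the power-of-a-point relation CX · CB = AC² − AB² (C lies outside the circle because AC > AB), the strict triangle inequality on BC, and that X lies strictly between B and C; the function name `solve_bc` is hypothetical.

```python
def solve_bc(ab=86, ac=97):
    """Return every (BC, BX, CX) with integer BX and CX that fits the problem."""
    # Power of the point C with respect to the circle centered at A
    # with radius AB: CX * CB = AC^2 - AB^2.
    power = ac * ac - ab * ab

    solutions = []
    # Strict triangle inequality: |AC - AB| < BC < AB + AC.
    for bc in range(abs(ac - ab) + 1, ab + ac):
        if power % bc != 0:
            continue  # CX would not be an integer
        cx = power // bc
        bx = bc - cx
        # X must lie on side BC and differ from B, so BX must be positive.
        if bx > 0:
            solutions.append((bc, bx, cx))
    return solutions


if __name__ == "__main__":
    print(solve_bc())
```

Under those assumptions the search should leave a single candidate for BC, which is consistent with reading the text as one unambiguous question about BC plus an unrelated trailing sentence.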