ttoinou a day ago

That is a classic riddle and could easily be part of the training data. Maybe if you reworded the logic, used different names, and switched to a language with less training data than English, it would be meaningful to see whether it found the answer using logic rather than pattern recognition.

KolmogorovComp a day ago | parent [-]

Had you paid more attention, you would have realised it's not the classic riddle, but an already tweaked version that makes it impossible to solve, hence why it is interesting.

albumen a day ago | parent | next [-]

Mellowobserver above offers three valid answers, unless your puzzle also clarified that he wants to get all the items/animals across to the other side alive/intact.

SamBam a day ago | parent | next [-]

Indeed, but no LLM has ever realized that I don't actually specify that he can only take one thing at a time. It's natural that it would assume that (as would most humans), because it is so heavily primed to fill that in from every other version of the puzzle it's seen.

I'd give them full credit if they noticed that, but I also wanted to see if, given the unstated assumptions (one thing in the boat, don't let anything eat anything else, etc.), they'd realize it was unsolvable.
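
For anyone who wants to check solvability mechanically rather than by eye, here is a minimal sketch of a breadth-first search over river-crossing states. The function name `min_crossings` and its parameters are made up for illustration, and the "tweaked" conflict list below is a hypothetical stand-in, not the actual puzzle from this thread. It assumes the usual rules: the farmer must row the boat, and no conflicting pair may be left on a bank without him.

```python
from collections import deque
from itertools import combinations

def min_crossings(items, conflicts, capacity=1):
    """Return the fewest boat trips that move every item across, or None."""
    items = frozenset(items)
    conflicts = [frozenset(pair) for pair in conflicts]

    def safe(bank):
        # A bank without the farmer is safe if no conflicting pair is on it.
        return not any(pair <= bank for pair in conflicts)

    # State: (items on the starting bank, is the farmer on the starting bank?)
    start = (items, True)
    goal = (frozenset(), False)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (left, farmer_left), trips = queue.popleft()
        if (left, farmer_left) == goal:
            return trips
        here = left if farmer_left else items - left
        # The farmer may carry anywhere from 0 to `capacity` items.
        for k in range(capacity + 1):
            for cargo in combinations(sorted(here), k):
                cargo = frozenset(cargo)
                new_left = left - cargo if farmer_left else left | cargo
                # The bank the farmer just left is now unattended.
                unattended = new_left if farmer_left else items - new_left
                if not safe(unattended):
                    continue
                state = (new_left, not farmer_left)
                if state not in seen:
                    seen.add(state)
                    queue.append((state, trips + 1))
    return None  # search exhausted: no valid sequence of crossings exists

things = ["wolf", "goat", "cabbage"]
classic = [("wolf", "goat"), ("goat", "cabbage")]
# Hypothetical tweak (an assumption): every pair conflicts.
tweaked = classic + [("wolf", "cabbage")]

print(min_crossings(things, classic))              # 7
print(min_crossings(things, tweaked))              # None
print(min_crossings(things, tweaked, capacity=2))  # 3
```

The last call illustrates the point about the unstated boat capacity: the same hypothetical variant that is unsolvable with one item per trip becomes solvable once two items fit in the boat.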

soledades 21 hours ago | parent | next [-]

Most humans would not assume that, since most humans are not heavily primed by logic puzzles.

cdelsolar 11 hours ago | parent | prev [-]

Why is it unsolvable? I am confused.

beepbooptheory a day ago | parent | prev [-]

All those answers recognize that it's a trick, though!

felipeerias 20 hours ago | parent | prev | next [-]

Both Claude 4 Sonnet and Opus fail this one, even with extended thinking enabled, and even with a follow-up request to double-check their answers:

“What is heavier, 20 pounds of lead or 20 feathers?”

cdelsolar 11 hours ago | parent [-]

ChatGPT (whatever fast model they use) passed that after I told it to "read my question again".

ttoinou a day ago | parent | prev [-]

Ah, right. But maybe someone has already thought of this simple trick/change too.