SamBam a day ago

This is the first LLM that has been able to answer my logic puzzle on the first try without several minutes of extended reasoning.

> A man wants to cross a river, and he has a cabbage, a goat, a wolf and a lion. If he leaves the goat alone with the cabbage, the goat will eat it. If he leaves the wolf with the goat, the wolf will eat it. And if he leaves the lion with either the wolf or the goat, the lion will eat them. How can he cross the river?

Like all the others, it started off confidently thinking it could solve it, but unlike all the others it realized after just two paragraphs that it would be impossible.

mellow_observer a day ago | parent | next [-]

Actual answer: He crosses the river and takes all of the animals and the cabbage with him in one go. Why not?

Alternative Answer: He just crosses the river. Why would he care who eats what?

Another Alternative Answer: He actually can't cross the river, since he doesn't have a boat and neither the cabbage nor the animals serve as appropriate flotation aids.

SamBam a day ago | parent | next [-]

Yup, for your first one, no LLM has ever realized that I don't actually specify that he can only take one thing at a time. I think that's natural that it would assume that (as would most humans) because it would be so heavily primed to fill that in from every other version of the puzzle it's seen.

speed_spread 13 hours ago | parent | prev [-]

I will trust LLMs the day they suggest making a raft out of the lion's skin and propelling it across the river by eating the cabbage and farting away. Goats are cool, keep the goat.

jermberj 9 hours ago | parent | prev | next [-]

Answer from Claude 4 Sonnet just now (I copy-pasted your prompt):

> This is a variation of the classic river-crossing puzzle, but with an added complexity - the lion that will eat both the wolf and the goat. Let me work through this step by step:

> The constraints:

> Goat + Cabbage alone = Goat eats cabbage

> Wolf + Goat alone = Wolf eats goat

> Lion + Wolf alone = Lion eats wolf

> Lion + Goat alone = Lion eats goat

> The solution:

> First trip: Man takes the goat across
> Left behind: Cabbage, Wolf, Lion (safe - wolf won't eat cabbage, lion won't eat wolf if goat isn't there)

> Second trip: Man returns alone, then takes the wolf across
> Brings the goat back to prevent wolf from eating it
> Left behind: Cabbage, Goat, Lion

> Third trip: Man takes the lion across, leaves it with the wolf
> Left behind: Cabbage, Goat (safe - goat won't eat cabbage if separated properly)

> Fourth trip: Man returns alone, takes the cabbage across
> Left behind: Goat (alone and safe)

> Fifth trip: Man returns alone, takes the goat across

> Wait, let me reconsider this more carefully. Actually, there's a fundamental problem: if the lion will eat either the wolf OR the goat when left alone with them, and we need to get all items across, we can never safely leave the lion with either animal.

> The puzzle as stated appears to have no solution, because the lion creates an impossible constraint - it cannot be left alone with either the wolf or the goat on either side of the river, making it impossible to transport all items safely.

It got it right eventually, which is impressive. However, its ability to retain proper context is still a problem: it took the goat on the first trip, then acted as if the goat were still on the same side of the river as the other things.

ungreased0675 a day ago | parent | prev | next [-]

The answer isn’t for him to get in a boat and go across? You didn’t say all the other things he has with him need to cross. “How can he cross the river?”

Or were you simplifying the scenario provided to the LLM?

IAmGraydon a day ago | parent | prev | next [-]

Now that you've posted this online, you can consider it unusable as a test. If you've ever posted it online before now, you can consider the results null and void.

Beyond that, as others have mentioned, this is not actually a logic puzzle at all, as there are multiple correct answers.

davidanekstein a day ago | parent | prev | next [-]

o4-mini-high got it on my first try after 9 seconds

ttoinou a day ago | parent | prev [-]

That is a classic riddle and could easily be part of the training data. Maybe if you change the wording of the logic, use different names, and switch to a language the model has seen less of than English, it would be meaningful to see whether it found the answer using logic rather than pattern recognition.

KolmogorovComp a day ago | parent [-]

Had you paid more attention, you would have realised it's not the classic riddle, but an already tweaked version that makes it impossible to solve, hence why it is interesting.

albumen a day ago | parent | next [-]

mellow_observer above offers three valid answers, unless your puzzle also clarified that he wants to get all the items/animals across to the other side alive/intact.

SamBam a day ago | parent | next [-]

Indeed, but no LLM has ever realized that I don't actually specify that he can only take one thing at a time. It's natural that it would assume that (as would most humans) because it would be so heavily primed to fill that in from every other version of the puzzle it's seen.

I'd give them full credit if they noticed that, but I also wanted to see if, given the unstated assumptions (one thing in the boat, don't let anything eat anything else, etc.), they'd realize it was unsolvable.
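For anyone who wants to check the "unsolvable" claim mechanically, here is a minimal sketch of a brute-force search over the puzzle's states. It assumes exactly the unstated rules listed above: a boat that carries the man plus at most `capacity` items, and no eater left with its prey on a bank the man is not on. The names `solve`, `safe`, and `capacity` are made up for this illustration and are not from the original puzzle.

```python
from collections import deque
from itertools import combinations

ITEMS = frozenset({"cabbage", "goat", "wolf", "lion"})
# (eater, eaten) pairs taken from the puzzle statement
EATS = {("goat", "cabbage"), ("wolf", "goat"), ("lion", "wolf"), ("lion", "goat")}


def safe(bank):
    """A bank without the man is safe if no eater is left with its prey."""
    return not any(eater in bank and prey in bank for eater, prey in EATS)


def solve(capacity=1):
    """Breadth-first search over (items on the start bank, man on start bank?).

    Returns the shortest list of boat loads, or None if no safe sequence exists.
    `capacity` is an assumed parameter: how many items fit in the boat with the man.
    """
    start = (ITEMS, True)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, man_left), path = queue.popleft()
        if (left, man_left) == goal:
            return path
        here = left if man_left else ITEMS - left
        # The man may cross alone (k = 0) or with up to `capacity` items.
        for k in range(capacity + 1):
            for load in combinations(sorted(here), k):
                load = frozenset(load)
                new_left = left - load if man_left else left | load
                # The bank the man leaves behind must be safe.
                unattended = new_left if man_left else ITEMS - new_left
                if not safe(unattended):
                    continue
                state = (new_left, not man_left)
                if state not in seen:
                    seen.add(state)
                    queue.append((state, path + [sorted(load)]))
    return None


if __name__ == "__main__":
    print(solve(capacity=1))  # one item per trip
    print(solve(capacity=4))  # everything fits in the boat
```

With `capacity=1` the search returns `None`: whichever single item the man takes first, the three left behind always include an eater and its prey, so there is no legal first move. With `capacity=4` it returns a single crossing carrying everything, which is the "take it all in one go" answer from upthread.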

soledades 20 hours ago | parent | next [-]

most humans would not assume that since most humans are not heavily primed by logic puzzles.

cdelsolar 11 hours ago | parent | prev [-]

Why is it unsolvable? I am confused.

beepbooptheory a day ago | parent | prev [-]

All those answers recognize that it's a trick though!

felipeerias 19 hours ago | parent | prev | next [-]

Both Claude 4 Sonnet and Opus fail this one, even with extended thinking enabled, and even with a follow-up request to double-check their answers:

“What is heavier, 20 pounds of lead or 20 feathers?”

cdelsolar 11 hours ago | parent [-]

ChatGPT (whatever fast model they use) passed that after I told it to "read my question again".

ttoinou a day ago | parent | prev [-]

Ah right. But maybe someone thought about this simple trick / change already too.