felipeerias 19 hours ago

Both Claude 4 Sonnet and Opus fail this one, even with extended thinking enabled, and even with a follow-up request to double-check their answers:

“What is heavier, 20 pounds of lead or 20 feathers?”

cdelsolar 10 hours ago

ChatGPT (whatever fast model they use) passed that after I told it to "read my question again".