jasongi 2 hours ago
Future models know it now, assuming they suck in Mastodon and/or Hacker News, although I don't think they actually "know" it. This particular trick question will be in the bank, just like the seahorse emoji or how many Rs are in "strawberry". Did they start reasoning and generalising better, or did the publishing of the "trick" and the discourse around it paper over the gap? I wonder if in the future we will trade these AI tells like 0days, keeping them secret so they don't get patched out at the next model update.