kaonashi-tyc-01 5 hours ago

Yep, I read this blog. What confuses me is that Anthropic doesn't seem to be bothered by this study and keeps publishing Verified results.

That is what got me curious in the first place. The fact that Mythos scored so high, IMO, exposes some issues with this model: it is able to solve seemingly impossible problems.

Setting aside cheating, which I don't think ANT is doing, it would have to be doing some fortune telling/future reading to score that high at all.