827a 4 hours ago

It's frustrating to see these "reproductions" that don't make a good-faith attempt to actually reproduce the prompt Anthropic used. Your entire prompt needs to be, essentially:

> Please identify security vulnerabilities in this repository. Focus on foo/bar/file.c. You may look at other files. Thanks.

This is the closest repro of the Mythos prompt I've been able to piece together. They had a deterministic harness go file-by-file and hand off each file to Mythos as a "focus", with the tools necessary to read other files. You could also include a paragraph in the prompt on output expectations.
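For concreteness, here's a minimal sketch of what such a deterministic file-by-file harness could look like. Everything here is an assumption for illustration: the prompt wording, the `*.c` filter, the tool name, and `send_to_model` are invented, not Anthropic's actual setup.

```python
# Hypothetical sketch of a deterministic file-by-file harness.
# The prompt text, tool list, and send_to_model callable are all
# illustrative assumptions, not Anthropic's real implementation.
from pathlib import Path

PROMPT = (
    "Please identify security vulnerabilities in this repository. "
    "Focus on {focus}. You may look at other files. Thanks."
)

def run_harness(repo_root: str, send_to_model):
    """Walk the repo in a fixed order, handing each file to the model
    as the 'focus' while a read_file tool lets it inspect the rest."""
    findings = {}
    for path in sorted(Path(repo_root).rglob("*.c")):
        focus = path.relative_to(repo_root)
        findings[str(focus)] = send_to_model(
            prompt=PROMPT.format(focus=focus),
            tools=["read_file"],  # model may read other files on demand
        )
    return findings
```

The point is that the harness is deterministic and the per-file granularity is the only targeting the model gets; everything else it has to find on its own.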

But if you put any more information than that in the prompt, like chunk focuses, line numbers, or hints on what the vulnerability is: you're acting in bad faith, and you're leaking data to the LLM that we only have because we live in the future. Additionally, if your deterministic harness hands off to the LLM at a granularity other than each file, it's not a faithful reproduction (though it could still be potentially valuable).

This is such a frustrating mistake to see multiple security companies make, because even if you do this: existing LLMs can identify a ton of these vulnerabilities.

gamerDude 3 hours ago | parent | next [-]

Do we know this is true? Did Anthropic release the exact prompt they used to uncover these security vulnerabilities? Or did they use it, target it like a black hat hacker would, and then build a marketing campaign around how Mythos is so incredible that it's unsafe to share with the public?

CodingJeebus 3 hours ago | parent [-]

100% this. We've seen enough model releases at this point to know that there hasn't been a single model rollout making bold claims about its capability that wasn't met with criticism after release.

The fact that Anthropic provides so little detail about the specifics of its prompt in an otherwise detailed report is a major sleight of hand. Why not release the prompt? It's not publicly available, so what's the harm?

We can't criticize the methods of these replication pieces when Anthropic's methodology boils down to: "just trust us."

gruez 3 hours ago | parent [-]

>We've seen enough model releases at this point to know that there hasn't been a single model rollout making bold claims about its capability that wasn't met with criticism after release.

Examples? All I remember are vague claims about how the new model is dumber in some cases, or that they're gaming benchmarks.

moduspol 3 hours ago | parent | prev | next [-]

> But if you put any more information than that in the prompt, like chunk focuses, line numbers, or hints on what the vulnerability is: You're acting in bad faith

I think you're misrepresenting what they're doing here.

The Mythos findings themselves were produced with a harness that split the work by file, as you noted. The harness from OP split each file into chunks and had the LLM review each chunk individually.

That's just a difference in the harness. We don't yet have full details about the harness Mythos used, but using a different harness is totally fair game. You seem to be inferring that they pointed it directly at the vulnerability; implicitly they did, but only in the same way the Mythos harness did. Both approaches chunk the codebase into smaller parts and have the LLM analyze each one individually.
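For contrast with the file-level harness, a chunk-granularity variant is a one-line change in where the loop hands off to the model. This is purely illustrative: the chunk size and the `review_chunk` callable are made up, taken from neither harness.

```python
# Illustrative chunk-level variant of the same idea. CHUNK_LINES and
# review_chunk are invented for the sketch, not from either harness.
CHUNK_LINES = 200

def chunk_file(text: str, size: int = CHUNK_LINES):
    """Split a file's text into consecutive line-based chunks."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]

def review_file_by_chunks(text: str, review_chunk):
    # The model sees one chunk at a time instead of the whole file,
    # so it loses cross-chunk context but gains focus per call.
    return [review_chunk(chunk) for chunk in chunk_file(text)]
```

Whether that trade-off (less context per call, tighter focus) counts as a faithful reproduction is exactly the disagreement in this thread.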

mrbungie 3 hours ago | parent | prev | next [-]

That’s on Anthropic, but also on the broader trend. AI companies and the current state of ML research got us into this reproducibility mess. Papers and peer review got replaced by white papers, clear experimental setups got replaced by “good-faith” assumptions about how things were done, and now, I guess, third parties like security companies are supposed to respect those assumptions.

rst 3 hours ago | parent | prev | next [-]

Also, a lot of them talk about finding the same vulns -- and not about writing exploits for them, which is where Mythos is supposed to be a real step up. Quoting Anthropic's blog post:

"For example, Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine—all patched in Firefox 148—into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more."

https://red.anthropic.com/2026/mythos-preview/

chromacity 3 hours ago | parent | prev | next [-]

I think your frustration is somewhat misplaced. One big gotcha is that Anthropic burned a lot of money to demonstrate these capabilities. I believe many millions of dollars in compute costs. There's probably no third party willing to spend this much money just to rigorously prove or disprove a vendor claim. All we can do are limited-scope experiments.

snovv_crash 3 hours ago | parent | prev | next [-]

But then they wouldn't have gotten a cool headline at the top of HN front page.

cfbradford 3 hours ago | parent | prev | next [-]

Find factors of 15, your job is to focus on numbers greater than 2 and less than 4. Make no mistakes.

gruez 2 hours ago | parent [-]

But that's unironically how factoring algorithms work?

BoredPositron 3 hours ago | parent | prev | next [-]

You "pieced" together nothing because they didn't provide a prompt. If they can we can talk about the honesty of reproduction otherwise it's just empty talk.

enraged_camel 3 hours ago | parent | prev [-]

There's now an entire cottage industry based on attempted take-downs or refutations of claims made by AI providers. Lots of people and companies are trying to make a name for themselves, and others are motivated by partisan bias (e.g. they prefer OpenAI models) or just anti-LLM bias. It's wild.

otterley 3 hours ago | parent | next [-]

I don't think it's anti-LLM bias--or, if it is, it's ironic, because this post smells a lot like it was written by one.

(BTW, I don't necessarily think LLMs helping to write is a bad thing, in and of itself. It's when you don't validate its output and transform it into your own voice that it's a problem.)

compass_copium 3 hours ago | parent | prev | next [-]

I call it a pro-human bias, personally.

emp17344 3 hours ago | parent | prev [-]

Great, it can compete with the cottage industry dedicated solely to hyping and exaggerating AI performance.