smca 13 hours ago

(I work at Anthropic) We have publicly stated[1] that our goal is to deploy Mythos-class models at scale when we have the requisite safeguards for offensive cyber risks in place. Mythos is a general frontier model, not a cyber-specific model, so there are many reasons why we think our users will benefit from access (with the aforementioned safeguards in place) in due course. Compute has also not factored into our decision[2] to roll out the model in a limited fashion to defenders. We'll be sharing more soon on the first month or so of the project and rollout.

[1] https://www.anthropic.com/glasswing#:~:text=deploy%20Mythos%...

[2] https://x.com/logangraham/status/2054613618168082935

alt227 12 hours ago | parent | next [-]

Multiple people who have already used Mythos or been given its reports on their software have publicly stated that it's all hype, and that it is not really finding any new critical bugs that other models can't.

Lerc 11 hours ago | parent | next [-]

Do you have any good sources on that? I have seen things to suggest that not all of the hype is true, but so far I have not encountered anyone claiming all of the hype is untrue. Which is what I interpret "its all hype" (sic) to mean.

bluGill 11 hours ago | parent | next [-]

CURL has been scanned with multiple LLMs. Mythos was last and as a result found only 1 issue. If Mythos were really much better, I'd expect it to find a lot more issues despite the others having already been there.

Also, the competing models are getting better. Opus 4.5 was better than everyone else when it was new, but only a few months later there are a lot of models that are better (not just the newer Opus models).

lmc 10 hours ago | parent | next [-]

Curl had a prominent bug bounty programme, has 180k lines of prod code, and is mainly a client app/lib. I would look at other projects before making judgements about Mythos based on this one.

dogleash 7 hours ago | parent [-]

Don't you want to test Mythos against state of the art projects? They are the best chance of making visible what Mythos uniquely brings to the table.

We already know that Mythos will be branded catnip for sub-SOTA projects. They could have built SOTA secure software development practices last week, last month or last year. But they didn't care. What will their experience with Mythos tell us other than that AI hype can create corporate will to take security seriously?

lmc 6 hours ago | parent [-]

> Don't you want to test Mythos against state of the art projects?

Yes, I'm just saying don't make judgements based on this single project alone.

preommr 11 hours ago | parent | prev [-]

Is the CURL thing mostly from the Primeagen video, or did it break into the greater social media sphere and I just missed it?

swingboy 10 hours ago | parent | next [-]

The cURL lead developer posted about it: https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...

alt227 10 hours ago | parent | prev | next [-]

The Reg reported it:

https://www.theregister.com/security/2026/05/11/anthropics-b...

bluGill 10 hours ago | parent | prev [-]

I've been following him on Mastodon and read it right there.

alt227 11 hours ago | parent | prev [-]

For example, it was recently let loose on cURL and its maintainer is less than impressed:

https://www.theregister.com/security/2026/05/11/anthropics-b...

Lerc 10 hours ago | parent [-]

If you remove the fluff that The Register added and stick with https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v... it seems like a claim that's fairly distant from "its all hype". Less than expected perhaps? Maybe the code really is unexpectedly robust? I guess time will tell on that point.

alt227 9 hours ago | parent [-]

His direct quotes are:

> "My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing."

> "I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos."

> "An amazingly successful marketing stunt for sure."

Personally I see this as a very strong claim of hype. I take away from this that Mythos is a hyped-up marketing stunt, and not at all what it was presented to be by Anthropic.

tclancy 10 hours ago | parent | prev | next [-]

In this very thread we have a counter-example. What to think? https://news.ycombinator.com/item?id=48149519

alt227 8 hours ago | parent [-]

Read the replies to that comment; they are very valid in their cynicism.

The cURL creator's report is the only real substance we have to go on. There are plenty of random comments both ways, but I'm sure you and I would both agree that we shouldn't base our opinions on random internet comments and should wait for more official reports like the cURL one to make the best judgement.

jsw97 10 hours ago | parent | prev [-]

Of course if it really is overhyped, then it becomes much more difficult to release it publicly. Better to retain the mystique and release the next thing. But we'll see eventually.

yanis_t 13 hours ago | parent | prev | next [-]

Are there any publicly verifiable sources showing that Mythos is so much more intelligent than Opus as to be considered much more dangerous (as it is presented in the public discourse by Anthropic)?

empath75 13 hours ago | parent [-]

It doesn't have to be _much more intelligent_ than Opus to be a risk. It doesn't even need to be _more intelligent_. It just needs to be _better at finding security problems_. Which could happen from just minor improvements in training data, or the harness, etc. Even a small improvement could shift it from finding very few new security holes, to reliably finding many at scale.

enraged_camel 11 hours ago | parent [-]

Yeah, I think a lot of the disconnect here is that people think of "model intelligence" as some sort of IQ score, rather than a combination of scores that measure abilities across a large variety of domains.

temac 12 hours ago | parent | prev | next [-]

Weird take to claim "generally intelligent frontier" (whatever that means) and restrict availability based on "offensive" cyber security alone (how this can be handled at all, compared to fixing software, also remains to be seen), all while competitors and, more importantly, software maintainers (e.g. curl) estimate that its capability in finding cybersecurity bugs is similar to what other modern models produce, and this has risen significantly in the last months for everybody.

grayhatter 11 hours ago | parent | prev | next [-]

are you able to detail a single safeguard you plan to implement, so that I can stop believing it's vaporware and/or a scam?

GenerWork 10 hours ago | parent [-]

How would it be vaporware? It's out in the wild and has been used by individuals/corporations.

grayhatter 8 hours ago | parent [-]

The security/safety controls they have to add to make it safe enough to release?

saidnooneever 11 hours ago | parent | prev [-]

no risk is added. all risk is already maxed out. release it.