rzmmm 4 hours ago

Quote:

"My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing."

It's a good reminder for us all that the competition in this space is rough and lots of more or less subtle marketing is involved.

63stack 2 minutes ago

I'm pretty sure Mythos is just a new unreleased version of Opus + marketing + a different system prompt.

therealpygon 2 hours ago

Anthropic using marketing to convince people their models are more advanced, better built, or that AI is a threat that needs to be regulated because only they have the answer? I’m shocked.

More seriously, so far I haven’t seen much indication that Mythos is more than Opus with a security-focused code analysis harness. That said, the fact that it can find these bugs in an automated fashion is the more important takeaway outside of the hype.

I’m curious what the error rate is on the detections, because none of that means much if it is wrong 90% of the time and we are only hearing about the examples that are useful marketing.
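
To make the error-rate question concrete, here is a minimal sketch with made-up numbers (nothing below comes from Anthropic or the curl project; the counts and the `precision` helper are hypothetical). If a tool files 100 reports and only 10 hold up, its precision is 10%, however impressive those 10 look in a press release:

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of reported findings that turn out to be real bugs."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no findings reported")
    return true_positives / total


# Hypothetical numbers, for illustration only.
reported = 100   # findings the tool filed
confirmed = 10   # findings maintainers confirmed as real

p = precision(confirmed, reported - confirmed)
print(f"precision: {p:.0%}, error rate: {1 - p:.0%}")
# prints "precision: 10%, error rate: 90%"
```

Without the denominator (total reports filed), a list of confirmed bugs alone says little about how much triage work the tool costs.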

johnbarron 2 hours ago

>> Anthropic using marketing to convince people their models are more advanced, better built, or that AI is a threat that needs to be regulated because only they have the answer? I’m shocked.

I remember when OpenAI was saying GPT-2 was too dangerous to release.

stingraycharles an hour ago

I remember when there was a guy at Google a few years ago who was convinced that they had an internal, sentient creature in their labs (I think maybe 4 years ago?)

If I’m not mistaken, after the media cycle, he lost his job for breaking confidentiality.

That was the opposite of marketing; Google really didn’t get how to turn this into a product until ChatGPT happened.

2ndorderthought an hour ago

"it can almost like write 2 paragraphs!" "It might be conscious" "this is basically AGI, we had to fire someone who spilled the beans"

etiam 20 minutes ago

I always thought he was fired for making crackpot statements to the press in reference to his professional capacity, thus creating bad PR and an embarrassing spectacle for his employer. Seems like legitimate reasons to me.

ZeroGravitas 15 minutes ago

An interesting question now is whether he had standard mental health issues, or if he was an early example of AI psychosis or whatever we call people who are falling in love with their AI chatbots because they tell them how smart they are.

vidarh 4 hours ago

It may well be that the hype was primarily marketing.

The other alternative is that Curl is simply secure enough that there was far less to find than in other projects.

teiferer 2 hours ago

Given how much money is on the line, it would be gross negligence if anything that came publicly out of the CEO's mouth, or was otherwise published by the company, were not marketing.

thombles 2 hours ago

Curl simply isn't a good data point. It's one of the most picked-over codebases in existence with extensive security testing practices. All the researchers using not-quite-Mythos models have had plenty of time to report bugs up to this point. Daniel may be right that Mythos hasn't been a game changer for curl but the preconditions are different for virtually any other codebase. Perhaps the real marketing here is his own modesty about curl's maturity.

GuB-42 an hour ago

To me, it is a very good data point.

Curl uses all sorts of tools, including AI tools, to find bugs. These tools, according to the article, found hundreds of bugs, including a dozen CVEs.

Mythos found one vulnerability. It means Mythos is just another tool, not the revolution it claims to be.

It is common that when a new tool is introduced, a bunch of bugs are found, with diminishing returns afterwards. Mythos finding one vulnerability is consistent with what I would expect for a major update to an existing tool, which Mythos is over existing LLM-based solutions.

thombles an hour ago

The question is how many security vulnerabilities are actually left in the code after all the recent AI attention. Either Mythos is a nothingburger, or it's substantially more powerful but there's nothing left to do. Even a large amount of C can be correct eventually. Curl has the _potential_ to become a good data point maybe 6-12 months from now - if researchers and new tools find many more vulnerabilities then Mythos is proved to be hype. If they don't, then maybe Mythos is overkill for today's curl and its capabilities are better deployed elsewhere (like Firefox, apparently).

GuB-42 18 minutes ago

I have a hard time believing that Mythos found the only remaining Curl vulnerability. It is possible, but highly improbable.

And it is not overkill; the proof is that it found that vulnerability. It is like saying the new version of some static analyzer with some new rules is "overkill" because it found only one more bug than the previous version. Deciding whether it is overkill or not is more about context. Using a very expensive model like Mythos for some little-used, non-critical software is overkill, but for Curl, it absolutely isn't.

If Mythos found loads of vulnerabilities in Firefox but not in Curl, I wouldn't say that's because Mythos is so good, but rather that with the release of Mythos, they did some testing that could have been done before using the same tools Curl has used.

thombles 7 minutes ago

We will see. As for "testing that could have been done before", Mozilla's posts indicate otherwise. Use of Opus 4.6 led to 22 security-sensitive bugs vs Mythos' 271 (https://blog.mozilla.org/en/privacy-security/ai-security-zer...). They already had the methodology in place when the more powerful model came along (https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...):

> Once the end-to-end pipeline is in place, it’s trivial to swap in different models when they become available. Building this pipeline early helped us find a number of serious bugs using publicly-available models, and it also helped us hit the ground running when we had the opportunity to evaluate Claude Mythos Preview. In our experience, model upgrades increase the effectiveness of the entire pipeline: the system gets simultaneously better at finding potential bugs, creating proof-of-concept test cases to demonstrate them, and articulating their pathology and impact.

spongebobstoes an hour ago

that makes it a good data point, because it is better able to illustrate the incremental capabilities of Mythos compared to previous tooling

that helps us to understand how much of Mythos is hype and how much is real

20k an hour ago

We see this exact hype train every time a new model is released. Mythos simply hasn't lived up to the "we're all gunna die from the flood of vulnerabilities" hype even slightly. It's slightly better than previous models by all accounts, cool stuff

I've seen literally near word-for-word this exact chain of events multiple times previously

bigcat12345678 2 hours ago

My guess:

Marketing is not intentional.

Evidence: 10 years ago, when I interviewed at Baidu AI with Andrew Ng and Dario, Dario was the kind of person who is pure-hearted to the point of being ideological. Given Dario's successful career so far, that essence has gradually grown into a conviction, and he is surrounded by a purposely built team that amplifies his ideology.

Humans are very convenient creatures, and a small fraction of them are no doubt masters of convenience: they morph their mental manifold without a hint of contradiction in their own mental mechanisms.

stingraycharles 2 hours ago

These things are layered. They are great scientists, smart people, etc.

Things change when you’re running a business like Anthropic, especially as the CEO. You have a responsibility to shareholders, and you just need to play the game.

Anthropic chose a great angle: focus on professionals / enterprise, safety, etc. Those can be driven both by a genuine desire to make great technology and by business purposes that require you to position yourself as a bit “better” than reality.

Just look at what their strategy is with Mythos, it’s almost perfection: the “it’s not ready to be released to the public” angle hits all the marks: they care about responsibility / safety, they have “the best” model, and “LLMs are dangerous, but we, as the guardians, can be trusted”. This also helps the industry as a whole with regulation: if they’re being constrained, China will develop even more dangerous models.

This is a result of how smart people treat business, it’s PR perfection, especially given how much the whole industry is talking about it.

(Yes, they fail in other PR areas, but that’s a different discussion)

OtherShrezzing 2 hours ago

I'm not sure if that distinction is important, since what you've described is, less charitably, synonymous with the phrase "Dario is delusional, and has surrounded himself with yes-men, so outlandish marketing gets published as a side effect".

Whether the person doing the marketing was sincere about it or not is immaterial, since marketing is experienced almost entirely by the people consuming it, and not the people communicating it. What matters is if the audience is sincerely concerned by the message, and it's transparently the case that they were sincerely concerned by it.

teiferer 2 hours ago

> Marketing is not intentional.

That's an odd definition of "intentional". Evolution has filtered for people with certain views and the marketing has just emerged from their actions. ... So?

A deadly virus (a naturally occurring one, let's say) wasn't created intentionally. Evolution selected for it. It's still bad and kills people. The lack of intention doesn't make it nice.

h1fra 3 hours ago

They might be biased by the fact that curl is significantly more secure than the average software

jansan 3 hours ago

Mythos marketing really leans into that "too powerful to be legal" vibe, much like how PS2s were allegedly banned from North Korea because their chips were basically missile-grade.

coldtea 4 hours ago

>It's a good reminder for us all that the competition in this space is rough and lots of more or less subtle marketing is involved.

About as subtle as a personal injury lawyer's billboard

steve1977 4 hours ago

Better Call Dario

te_chris 4 hours ago

A thankfully American reference

Exoristos 3 hours ago

Can you expand on this? Do you mean in contrast to the European AI milieu?

te_chris 2 hours ago

No, the personal injury lawyer billboards.

greendude29 4 hours ago

I'd go further and say the marketing is not subtle. The hype and fanboys/girls are so in line with the marketing that any level of skepticism is seen as an act of defection, but if you look at the words, hyperbole and volume that are used, there is nothing subtle about it.

It's almost Trump-esque - "this model will change everything forever; we are doomed; we are saved; we will all be fired; we will all be rich", etc

xantronix 4 hours ago

That's a pretty good encapsulation of the parallels between the political and the technological: each necessarily thrives upon the other, and the two are inextricable. This moment is a culmination of all the disenfranchisement the body politic has suffered, looking for any possible means of escape or elevation. AI and Trumpism, for their own respective cohorts, are salvation, on offer by different frontmen but ultimately in service of the same system.

They need the hype to pay off way more than we do. So many of us who still write code directly stand to lose nothing of our capabilities if the marketing claims cannot hold water.

ehnto 3 hours ago

I seem to be totally outside the hype bubble, but I have to suspect there is a lot of imagineering and wild extrapolation in the less technical hype bubbles. I am curious, but not enough to go looking.

tonyedgecombe 3 hours ago

>I seem to be totally outside the hype bubble

I'm surprised you say that because it is all over Hacker News. Every single post is co-opted into promoting AI. Try finding a submission with fifty points or more that doesn't have AI or LLMs mentioned somewhere in the comments.

zen928 2 hours ago

Feel free to retire from the field if you grow tired of seeing its latest developments.

tonyedgecombe 44 minutes ago

I already have.

That’s not really the point though. I have no doubt AI is useful, I just don’t want to have it shoved in my face every five minutes.
