| ▲ | saithound 13 hours ago |
| It's pretty clear at this point that Mythos' capability to discover and exploit zero-day vulnerabilities at scale is but an incremental improvement over existing models like the ones available to OpenAI's Plus/Pro subscribers. Anthropic tries to create marketing hype around Mythos using two psychological tricks.

1. Put large numbers in the headlines. "Mythos discovered 271 vulnerabilities in Firefox" makes the model seem extremely capable to the uninitiated. But it's actually meaningless as a measure of capability _improvement_. Anthropic gave away $100mil specifically as Mythos credits to these projects and companies (that's $2.5mil per project). Spending the same exorbitant amount of compute analyzing the same codebases with an older model like GPT 5.x Pro would have turned up 260 of these vulnerabilities, or might even have turned up more than 271. No need to speculate, since this is exactly what we saw in the few codebases where we have such comparisons (like the curl codebase). Supposedly weaker models, working with a much lower budget, turned up dozens of vulnerabilities. Mythos turned up only one, which ended up as a low-severity CVE.

2. Do the whole "too dangerous to release" shtick. This is one of Dario Amodei's favorite moves. When he was vice president of research at OpenAI, he declared GPT-3 (which wasn't able to produce coherent text beyond 3-4 sentences at the time) too dangerous [1] as well.

Long story short, it's the ChatGPT 4.5 situation again: a company trained a model that's too slow and expensive, but not much more capable than what came before. It therefore requires these marketing stunts.

[1] https://www.itpro.com/technology/artificial-intelligence-ai/... |
|
| ▲ | IX-103 11 hours ago | parent | next [-] |
| I work for a company that has been using Mythos for vulnerability detection in our software. The results we're getting are revolutionary to the point that our software security teams are heavily overloaded addressing the deluge of thousands of real bugs/vulnerabilities and design flaws across our billions of lines of code. For comparison, we are invested heavily in the AI space, to the point where Anthropic is one of our competitors. We were already using state-of-the-art models to find flaws in our code, but Mythos was just so much better at finding real vulnerabilities it's not even funny. |
| |
| ▲ | zelda420 9 hours ago | parent | next [-] | | Yeah I’m a security researcher and my colleagues who have access say it’s insanely good… but interestingly they also work for places like Nvidia, which have a deep vested interest in selling tokens and hardware. So of course they are pushing this narrative. | |
| ▲ | thrawa8387336 11 hours ago | parent | prev | next [-] | | Read the above comment again. Your comment and theirs are compatible. | | |
| ▲ | anon84873628 10 hours ago | parent [-] | | They are directly contradicting the claim that if you ran other models on the same codebases you would get similar results. |
| |
| ▲ | The_Blade 11 hours ago | parent | prev | next [-] | | If you are invested heavily in the AI space, isn't it in your best interest for the froth around Mythos to be true and the comment you are responding to to be invalid? Even if you are competing with Anthropic, a rising tide raises all ships. I'd like to see more facts and data one way or another! | | |
| ▲ | anon84873628 10 hours ago | parent [-] | | This is the "circumstantial" version of the ad hominem fallacy. Just because the author may benefit from the argument being true doesn't mean it is invalid. They are clearly disputing the assertion that Mythos is an incremental gain rather than a quantum leap. Of course objective unbiased data would be nice, but these anecdotes are all we have right now. |
| |
| ▲ | bob1029 11 hours ago | parent | prev [-] | | > billions of lines of code. Billions as in 10^9? | | |
|
|
| ▲ | jcims 11 hours ago | parent | prev | next [-] |
| > Do the whole "too dangerous to release" shtick.

One aspect that isn't really discussed much in this context is how to wrap one's head around the corporate risk of models with ever-increasing capability. It might not be too dangerous to society, but it could be too dangerous to Anthropic. |
| |
|
| ▲ | kilroy123 12 hours ago | parent | prev | next [-] |
| I couldn't agree more. I think the recent moves to partner with xAI and Amazon are proof that they desperately need more compute and are doing everything possible to get it. |
| |
| ▲ | MattRix 11 hours ago | parent [-] | | I mean everyone knows they need more compute. That’s not a secret or up for debate at all. They are maybe the fastest growing company in history. |
|
|
| ▲ | fwipsy 12 hours ago | parent | prev | next [-] |
| I'm fairly certain Amodei believes the "too dangerous to release" hype himself. Even if it's just an incremental improvement, raising the alarm now is better than getting frog-boiled by repeated 20% improvements until someone builds bioweapons in their backyard.
| |
| ▲ | drakythe 12 hours ago | parent | next [-] | | He's made so many statements that fall under the "boy who cried wolf" category that even if he _does_ believe these statements, he needs to be managed better. I'll never forget Anthropic's huge "Oh my God, the AI blackmailed a researcher to save itself!" episode, when the prompt effectively told the AI to do exactly that and gave it forged emails with easy blackmail targets, as if this isn't a common trope in mystery and suspense books/television/fanfiction, all of which Claude (and others) have been trained on. | | |
| ▲ | ctoth 11 hours ago | parent | next [-] | | It's a common trope, all through the training data, and all the modern AIs have read it, and would probably act similarly? Is that what we should take away from your comment?
So we have nothing to worry about. Makes sense. Really, it's just a common trope. | |
| ▲ | fwipsy 10 hours ago | parent [-] | | Oh, of course wolves have sharp teeth; they're predators. Anyone who knows this can never be bitten. |
| |
| ▲ | fwipsy 10 hours ago | parent | prev [-] | | Imagine you're in a car and the car is driving towards a cliff. You shout at the driver "oh my god we're about to go over a cliff!" And he says "you said that two seconds ago, but we're still alive, you're just like the boy who cried wolf. Do you know exactly when we're going to go over a cliff? No? Maybe you're imagining the cliff." I think it's very improbable that AI is as dangerous as Yud et al fear it is. But it's too soon to say, and there seems to be significant long-tail risk. Mocking or criticizing people for being concerned about that risk seems counterproductive. Seems like the life cycle of huge tech companies like Meta, Google, Microsoft, Amazon is "do whatever's necessary to take over the world, then enshittify." I don't take it for granted that Amodei and Anthropic, so far, don't seem to be maximally power-hungry. Re: second half of your comment. Understanding a threat doesn't neutralize it. Anthropic didn't make that big a deal of it either; it was news articles that blew it out of proportion. |
| |
| ▲ | moralestapia 11 hours ago | parent | prev [-] | | *sigh* Three things:

* Delaying the release accomplishes nothing.

* The barrier to someone building/not-building a bioweapon in their backyard is not access to an LLM.

* Remember when GPT 3.5 was going to destroy the world? And how it was conscious? And how it was "trying to escape"? Lmao. | |
| ▲ | malfist 11 hours ago | parent | next [-] | | I think gpt 3.5 might have destroyed the world | |
| ▲ | usaar333 11 hours ago | parent | prev | next [-] | | How does delaying the release not solve anything? It puts everyone on notice to fix all security vulnerabilities now. | | |
| ▲ | spooneybarger 11 hours ago | parent [-] | | Because the only thing keeping those vulnerabilities in existence was laziness. | | |
| ▲ | anon84873628 10 hours ago | parent [-] | | "laziness" is an interesting reframing of "rational cost-benefit analysis and the limits of the human mind". |
|
| |
| ▲ | fwipsy 10 hours ago | parent | prev [-] | | You're right, it's silly for me to worry. We've never had a technology that initially appeared benign but turned into a big problem. In fact, no tech company has ever released technologies that cause problems for the rest of society AT ALL. /s What are the other barriers? Last I checked, access to CRISPR is not especially tightly regulated. Even if it is, defense in depth is a thing. | |
| ▲ | moralestapia 10 hours ago | parent [-] | | If it was as easy as "knowing how to," someone would've already done it, or at least attempted to.* Plenty of people know how to: tens of thousands of researchers; perhaps you know someone who does. Did you know that your local veterinary shop has enough drugs to kill hundreds of people? Why doesn't it happen?

* It's not that easy.

* There's a ton of regulation that is hard to circumvent, on purpose.

* There's a gigantic deterrent called "spend the rest of your life behind bars" that people tend to avoid.

An LLM, even the most advanced one, does not make any material change in any of these. You cannot bullshit your way into "uhh, I need Ebola samples for ... reasons". Unironically, your Sunday movie portraying a super-villain threatening a city with his "home lab" full of flasks with colored liquids and biohazard signs pushes way more people into becoming interested in this than having access to an LLM does.

*: Okay, like 5 people, and way before LLMs were a thing. This has been a thing for decades; we're fine. | |
| ▲ | fwipsy 4 hours ago | parent [-] | | CRISPR has not been a thing for decades. Biotechnology is advancing and AI is lowering the bar to use it. In 2018 a PhD student was able to synthesize an infectious horsepox virus: https://journals.plos.org/plosone/article?id=10.1371/journal... So far the overlap between people with bioengineering capabilities and murderous tendencies has been very low. As the technology becomes available to more people that overlap may increase. Even if it never comes within reach of one person, what about North Korea, or Iran? AI can be jailbroken. The LLM safeguards your argument relies on were put in place by the people you're criticizing for being too safety-conscious. Security through obscurity is no guarantee. |
|
|
|
|
|
| ▲ | InkCanon 11 hours ago | parent | prev | next [-] |
| Also, each term gets stretched slightly, and the stretches compound, so the combined claim ends up far from the truth. For example, the 271 "vulnerabilities" were really mostly bugs: generally incorrect states, but ones that almost never led to any exploit. |
| |
| ▲ | Lord-Jobo 11 hours ago | parent [-] | | Yes, an AI making massive gains in bug finding is hugely important and good; it may even break even against the amount of bugs introduced by other AI coding processes. But it's a far cry from how Mythos is portrayed most of the time: an automatic super hacker. | |
| ▲ | SpicyLemonZest 11 hours ago | parent [-] | | But I think that's a problem with the people portraying it that way, not with Anthropic's messaging. If you've invented "just" a massively more powerful bug finder, it still seems right that you ought to let banks and critical infrastructure providers run it on their systems before it gets in the hands of people who might want to hack them. |
|
|
|
| ▲ | 11 hours ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | jorisw 12 hours ago | parent | prev | next [-] |
| You're not really responding to the piece at all. |
| |
| ▲ | saithound 12 hours ago | parent [-] | | It's an AI-written slop article, which is hugged to death by HN in any case. It claims to be an evidence-based investigation, but basically invents, from whole cloth, the contents of the documents it supposedly investigated, such as the Anthropic Frontier Red Team writeup. I don't think deeper engagement with it would promote good discussion. | |
| ▲ | jorisw 12 hours ago | parent | next [-] | | So you say. I actually read the piece and didn't get AI vibes from it at all, except for the graphics. | |
| ▲ | gofreddygo 12 hours ago | parent | next [-] | | there are 31 emdashes in that piece. the domain ends with _ai_ | | |
| ▲ | wood_spirit 12 hours ago | parent | next [-] | | It’s a tangent but two points: First, the reason LLMs learned to like em dashes is that they are common in the training corpus: they were a thing before LLMs, which LLMs learned rather than invented. Second, my work browser puts nice blue squiggles under everything I write into a textbox. I dutifully click through them and accept the rephrasing suggestions, and I get a lot of em dashes. My blog posts and whitepapers and stuff are full of them and other “AI tells”, but I think they read better because of it. | |
| ▲ | jorisw 12 hours ago | parent | prev | next [-] | | I use em dashes all the time. They're correct punctuation, as opposed to a minus sign. They're easy to type too: opt-shift-minus. If they were such a huge giveaway, never used by humans, models would have been trained by now not to use them as much. The blog is about AI. So yeah, the TLD is .ai. | |
| ▲ | phainopepla2 10 hours ago | parent [-] | | I've never seen writing created before the advent of LLMs that used em dashes in the same way and with the same frequency that LLMs regularly do. There's probably some out there, but it would be a real outlier. LLMs overuse them to an absurd degree, putting them where most writers would put commas, occasionally semicolons, or nothing at all. I count 51 em dashes on the page, which is extreme. They're also used in places where they don't really belong. It's very obviously LLM-generated, at least in part. That said, it puzzles me why people don't prompt LLMs to change up the writing style a bit and remove some of the tells. |
| |
| ▲ | tiahura 12 hours ago | parent | prev [-] | | I can't imagine why a system designed to reproduce the best writing styles would frequently use em dashes. |
| |
| ▲ | evanelias 11 hours ago | parent | prev [-] | | Take another look at this blog's index https://kingy.ai/category/blog/ and click through more posts, and pay attention to the post dates. Do you really think this single author is writing multiple excessively long blog posts about AI per day? There are ~650 of these posts over the past 18 months. And over on LinkedIn, the author describes himself as a "Specialist in Digital Marketing, Videography / Video Editing, Search Engine Optimization, Social Media, and B2B Sales." YMMV but this post and entire site absolutely screams "slop" to me. |
| |
| ▲ | 12 hours ago | parent | prev | next [-] | | [deleted] | |
| ▲ | shimman 12 hours ago | parent | prev [-] | | Don't bother with the slop lovers, these people are anti-human in their souls and willing to follow the most evil people on Earth to the depths of hell; for what? I have zero idea but it's sad to see. | | |
|
|
|
| ▲ | lumost 10 hours ago | parent | prev | next [-] |
| I find it interesting that Mythos was announced the same day that GLM overtook Opus 4.6 in capability. To me this seems like a careful attempt to cool demand for open-source models, which are about to take the overall lead. |
| |
| ▲ | iaw 10 hours ago | parent [-] | | It's remarkable how capable GLM 5.1 is; what's amazing is the recent development of Qwen 3.6 27B coming close in real-world performance. |
|
|
| ▲ | andai 11 hours ago | parent | prev | next [-] |
| I don't get it. If the older / smaller models are almost as good as Mythos, that sounds like the opposite of comforting. |
|
| ▲ | baq 10 hours ago | parent | prev | next [-] |
| > an incremental improvement

I've had to reboot my systems quite a bit more this week than an incremental improvement would suggest. |
|
| ▲ | 12 hours ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | FergusArgyll 12 hours ago | parent | prev | next [-] |
| > It's pretty clear at this point that Mythos' capability to discover and exploit zero-day vulnerabilities at scale is but an incremental improvement over existing models like ChatGPT Plus/Pro.

I'm skeptical of AI takes by someone who thinks there's a model called chatgpt plus. Spend more time working with the current systems! |
| |
| ▲ | saithound 12 hours ago | parent [-] | | It seems like everybody (including you) knew precisely what I meant: the models available for ChatGPT Plus or Pro subscribers, i.e. GPT-5.5 Thinking Extended and the latest Pro. I've edited the offending sentence for clarity just in case. If I got you to be skeptical of AI takes, though, mission accomplished. Exercise your skepticism especially when the takes come from somebody who is trying to sell something. |
|
|
| ▲ | promptunit 12 hours ago | parent | prev [-] |
| [flagged] |