| ▲ | causal 5 hours ago |
| Maybe it's marketing, but I think it's regrettable that Anthropic paired project Glasswing with Mythos. It really makes it seem like Mythos is the threat, rather than the fact that tons of vulnerabilities have always been ignored throughout the software world. If Glasswing had been started years ago with the goal of applying fixes to AI-found gaps, then this would just be another model to add to that effort. But doing so in the ominous shadow of some new super model boosts panic IMO. |
|
| ▲ | pixel_popping 5 hours ago | parent | next [-] |
| Cybersecurity is taken too lightly, and it mostly boils down to recklessness of developers: they are just "praying" that no one acts on the issues they already know about, and it's something we must start talking about. Common recklessness obviously includes devs running binaries on their work machine, not using basic isolation (why?), sticky IP addresses that straight-up identify them, and, even worse, using the same browser to access admin panels and random memes. Obviously there are a hundred more like those that are ALREADY solved and KNOWN by the developers themselves. You literally have developers that still use cleartext DNS (apparently they are ok with their history being accessible by random outsourced employees). |
| |
| ▲ | snovymgodym 4 hours ago | parent | next [-] | | > it mostly boils down to recklessness of developers I disagree. I think in big tech and the corporate world, it boils down to the organization fundamentally not valuing security and punishing developers if they "move slow", which is often the outcome when you maintain a highly security-oriented process while developing software and infrastructure. When big leaks happen, the worst that occurs is that some trivial financial penalty is applied to the company, so the incentive to ignore security problems until you're forced to acknowledge them is high. | | |
| ▲ | specialist 3 hours ago | parent [-] | | Last gig I had that took QA/Test seriously was late '90s. I have no hopes the situation will improve, for quality or security, until something fundamental changes. |
| |
| ▲ | giantg2 3 hours ago | parent | prev | next [-] | | "Cybersecurity is taken too lightly and it mostly boils down to recklessness of developers, they are just "praying" that no-one act on the issues they already know and it's something we must start talking about." I agree that cybersecurity is taken too lightly. However, I think that many developers don't actually know about vulnerabilities. In many companies those reports get filtered through other teams and prioritized by PMs. The devs tend to do their best at meeting the aggressive schedules the penny-pinching business people set. | | |
| ▲ | nradov 2 hours ago | parent | next [-] | | Business managers sometimes make bad decisions (at least in retrospect) around budgets and priorities. But the reality is that there are a limited number of pennies, and if someone doesn't pinch them then there are no pennies left to pay developers. | |
| ▲ | pixel_popping 3 hours ago | parent | prev [-] | | I frankly believe that many know what they are doing. Take the average freelancer: developing for multiple clients in the same workspace (suicidal, and ethically wrong on top of it) without even disk encryption enabled, or straight up syncing everything in cleartext to Dropbox. | | |
| ▲ | giantg2 3 hours ago | parent [-] | | Or they're a freelancer because they aren't good enough for a big-salary job |
|
| |
| ▲ | LunaSea 4 hours ago | parent | prev | next [-] | | Highly disagree. Most of the time it's a question of management not caring about security, or disliking the inconvenience that security can bring. | | |
| ▲ | pixel_popping 4 hours ago | parent [-] | | I agree as well. For FOSS projects, for example, it's exactly as you say: securing things is an inconvenience, and we come back to "I pray that no one exploits X". | | |
| ▲ | LunaSea 3 hours ago | parent [-] | | FOSS projects are a different beast, since contributors are working for free and no contributor may have the time to fix a security bug or review a PR fixing one. I might add, however, that most companies use FOSS projects without paying for or contributing to them. The onus is still on the final user to make sure that the code they use is safe. |
|
| |
| ▲ | causal 5 hours ago | parent | prev | next [-] | | Totally agree, though I'd argue that it's still a software failure if preventing exploits requires every user to memorize and follow an onerous list of best practices. | | |
| ▲ | pixel_popping 5 hours ago | parent [-] | | This is where security is heavily intertwined with privacy: by following good privacy principles, you automatically cover a lot of security issues. |
| |
| ▲ | matheusmoreira 3 hours ago | parent | prev | next [-] | | > recklessness of developers Nah. It's the corporations that could not care less and therefore do not reward careful work. They care about nothing but time to market. Start stacking legal and financial liability and I guarantee they are suddenly going to start caring a lot. | |
| ▲ | 5 hours ago | parent | prev | next [-] | | [deleted] | |
| ▲ | sdwr 4 hours ago | parent | prev | next [-] | | Recklessness is based on effort, likelihood, and consequence. If you live in a small town, you might not lock your front door. No matter where you live, you probably don't lock your second floor windows. | | |
| ▲ | pixel_popping 3 hours ago | parent [-] | | Are we making enough of an effort, though? The AI era invites us to get our shit together as well. We are all guilty of it, but we must also understand that if you live in an area with a high crime rate, you adapt and lock your door. The same must now be applied online, since we will have 24/7 rogue agents whose sole purpose is ransoms and attacks of all kinds. |
| |
| ▲ | MrDarcy 3 hours ago | parent | prev | next [-] | | I read your list and all of that is normal computer use. How can it be reckless to use a computer normally? | | |
| ▲ | pixel_popping 3 hours ago | parent [-] | | Normal doesn't mean "right". We have piled up a ton of bad decisions, and users who are aware should know better than to stick with default settings. |
| |
| ▲ | jacquesm 4 hours ago | parent | prev [-] | | You missed the management factor. Even if managers don't explicitly ask you to build insecure stuff, they will up the pressure to the point where you have no choice but to comply or leave, and the company will find someone who will do just that. So the end result is the same. Rarely will individuals push back with any force, and when they do they are eventually let go because they're 'troublemakers'. |
|
|
| ▲ | ofjcihen 3 hours ago | parent | prev | next [-] |
This. I’ve been hearing panic from the non-security community about Mythos because “zomg z3r0 d4y5!!” since the announcement, but these are the same people running production servers 10 updates and 2 critical security fixes behind for years. I don’t need cutting-edge AI to take you down. I need Metasploit with a CVE list that’s been updated in the last 6 months. |
|
| ▲ | spandrew 4 hours ago | parent | prev | next [-] |
You're making a hubris-laden assumption that coders know the gaps they're baking into their software, that any human has a decent enough grip on the multitudes of spinning logic duct-taped together to make the internet run. Most vulnerabilities aren't "ignored"; they're in a neverending backlog, or unknown. If you closed all of the AI-discovered security vulnerabilities tomorrow, by the next day there'd be a host of new ones. That's software, baby. |
|
| ▲ | gertlabs 3 hours ago | parent | prev | next [-] |
The strongest model we've benchmarked on our comprehensive, little-known, and difficult-to-game benchmark is still Claude Opus 4.5 for agentic workflows. That's not a typo. Interpret that how you will, but if Anthropic had to take cost/resource-saving measures after the last major release, less than 6 months ago, it's unlikely they have the economics to offer what Mythos is promised to be at any sort of product scale. But I agree, it would be great to get stronger models and start securing all the junk on the web. Of course, that requires maintainers to know how to use these tools. Benchmarks at https://gertlabs.com/?agentic=all |
|
| ▲ | hn_throwaway_99 2 hours ago | parent | prev | next [-] |
I'm particularly interested in whether someone with relevant expertise could comment on the types of bugs Mythos found, e.g. the 27-year-old OpenBSD bug. I ask because the media around Mythos is leaning into the "Mythos is a super intelligence that can find bugs that no human can" story. But in my mind it's pretty obvious that any software that is complex enough will have a lot of lurking zero days, and better tools will asymptotically find more of them. So it seems to me something like Mythos would just be able to do more analysis/searching for bugs at a much faster rate than was previously possible. But I'm skeptical that the bugs that were found required an insane amount of analytical ability to locate, so I would really appreciate it if someone could comment on that (e.g. was it "yeah, with enough time we would have found it eventually" vs. "wow, this was an insanely difficult bug to find in the first place"). I do agree that, medium/long term, tools like Mythos will be a huge boon for cybersecurity, because they will inherently make it easier to write bug-free code in the first place. But yeah, we're now at a point where all these "pre-AI bugs" need to be fixed and patched before folks in the wild find all these zero days. |
| |
| ▲ | adrian_b 2 hours ago | parent [-] | | The OpenBSD bug was more difficult for LLMs because it is an integer overflow bug, while out-of-bounds accesses are more common bugs that are found by most models. The OpenBSD bug was also found by GPT-OSS and by Kimi-K2: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag... The first condition for finding a bug is to actually audit the code where the bugs exist. When a human does that, it is a lot of work, which is often avoided. LLMs can simplify this, but you must use them for this purpose. As the link above shows, using multiple older open weights models was enough to find all the bugs found by Mythos. The improvement demonstrated by Mythos is that it could be used alone to find all those bugs, while with older models you had to run more of them to find everything, because each model would find only part of the bugs. Even so, I prefer using all those open weights models together, at a negligible additional cost, while Mythos is unavailable to non-privileged users, and even when it becomes available to more people it will be much more expensive than the alternatives. |
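For readers outside security: the integer-overflow bug class adrian_b mentions can be sketched in a few lines. This is a generic illustration (simulating 32-bit unsigned arithmetic in Python), not the actual OpenBSD bug; the names and numbers are made up for the example.

```python
# Generic illustration of the integer-overflow bug class (CWE-190 style),
# not the actual OpenBSD bug. A 32-bit size computation wraps around, so a
# later bounds check passes even though the buffer is far too small.
U32_MASK = 0xFFFFFFFF  # simulate 32-bit unsigned wraparound in Python

def alloc_size(count: int, elem_size: int) -> int:
    """Size computation as buggy 32-bit C code would perform it."""
    return (count * elem_size) & U32_MASK

count = 0x4000_0001           # attacker-controlled element count
size = alloc_size(count, 4)   # 0x1_0000_0004 wraps to 4

# The naive sanity check passes, yet 'size' covers only one element,
# so copying 'count' elements into the buffer overflows it.
assert size <= 1024           # check meant to reject huge requests
assert size == 4              # the true requirement was ~4 GiB
```

Out-of-bounds reads/writes, by contrast, are visible at the access site itself, which may be why most models catch them more readily.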
|
|
| ▲ | 3 hours ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | skybrian 5 hours ago | parent | prev | next [-] |
A year ago, LLMs weren't good enough to find these security issues. They could have done other stuff. But then again, the big tech companies were already doing other stuff: bug bounties, fuzzing, rewriting key libraries, and so on. This initiative probably could have started a few months sooner with Opus and similar models, though. |
| |
| ▲ | adrian_b 4 hours ago | parent | next [-] | | Using multiple older open weights models can find all the security issues that were found by Mythos. However, no single one of those models could find everything that Mythos found. https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag... Nevertheless, the distance between free models and Mythos is not as great as Anthropic's marketing claims, which of course is not surprising. In general, this is expected to hold for other applications too: no single model is equally good at everything, even among SOTA models, so trying multiple models may be necessary to obtain the best results. With open weights models, trying many of them may add negligible cost, especially if they are hosted locally. | |
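The ensemble idea above (each model finds only a subset of bugs; their union covers more) can be sketched minimally. `run_model` here is a hypothetical stand-in for whatever harness actually drives each open-weights model, and the model names and finding IDs are made up for illustration.

```python
# Sketch of the ensemble approach: run several models over the same target
# and take the union of deduplicated findings. `run_model` is a hypothetical
# stand-in for a real scanning harness; results below are toy data.
from typing import Callable

def union_of_findings(models: list[str],
                      run_model: Callable[[str, str], set[str]],
                      target: str) -> set[str]:
    """Union of each model's findings: together they cover more bugs."""
    findings: set[str] = set()
    for model in models:
        findings |= run_model(model, target)
    return findings

# Toy results mirroring the observation that each model finds only a part:
fake_results = {
    "gpt-oss": {"int-overflow-a", "oob-read-b"},
    "kimi-k2": {"int-overflow-a", "oob-write-c"},
}
run = lambda model, target: fake_results.get(model, set())

bugs = union_of_findings(["gpt-oss", "kimi-k2"], run, "some-repo")
# Neither toy model alone reports all three issues; the union does.
```

At local-hosting prices the extra runs are nearly free, which is the economic point being made.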
| ▲ | causal 5 hours ago | parent | prev | next [-] | | That's not quite true; even a year ago LLMs were finding vulnerabilities, especially when paired with an agent harness and lots of compute. And even before that, security researchers were shouting about systemic fragility. Mythos certainly represents a big increase in exploitation capability, and we should have seen this coming. | | |
| ▲ | Analemma_ 5 hours ago | parent [-] | | A lot of those bugs were found by seasoned developers and security professionals though. Anthropic claims that Mythos is finding vulns for people who have no security background, who just typed "hey, go find a vulnerability in X", went home for the night, and came back the next morning with a PoC ready. They could definitely be exaggerating, but if it's true, that's a very different threat category which is worth paying attention to. | | |
| ▲ | qingcharles 5 hours ago | parent | next [-] | | Previous models have done this just fine. For the last year, whenever a new model has come out I just point it at some of my repos and say something like "scan this entire codebase, look for bugs, overengineering, security flaws etc" and they always find a few useful things. Obviously each new model does this better than the last, though. | |
| ▲ | causal 5 hours ago | parent | prev [-] | | Yes, previous models found vulnerabilities but Mythos is uniquely capable of actually exploiting them: https://red.anthropic.com/2026/mythos-preview/ | | |
| ▲ | pxc 5 hours ago | parent [-] | | Imo that's a big deal primarily because the issue with automatically discovered vulnerabilities has long been a high volume of reports and a very bad signal-to-noise ratio. When an LLM is capable of developing PoC exploits, you finally have a tool that enables meaningfully triaging reports like this. |
|
|
| |
| ▲ | pixel_popping 5 hours ago | parent | prev | next [-] | | If you run Opus 4.6 and GPT 5.4 in a loop right now (maybe 100 times) against the top XXXX repos, I guarantee you that you'll find, at the very least, medium-severity vulnerabilities. | |
| ▲ | alephnerd 4 hours ago | parent | prev | next [-] | | > A year ago the LLM's weren't good enough to find these security issues I know of two F100s that already started using foundation models for SCA in tandem with other products back in 2024. It's noisy, but a false positive is less harmful than an undetected true positive depending on the environment. | |
| ▲ | vonneumannstan 5 hours ago | parent | prev [-] | | >This initiative probably could have started a few months sooner with Opus and similar models, though. Evidently they tried, and even the most recent Opus 4.6 models couldn't find much. There's been a step change in capabilities here. | |
|
|
| ▲ | SpicyLemonZest 5 hours ago | parent | prev [-] |
| I guess I'm not sure why you frame this as a "rather than". What Anthropic is saying is that the norm of having tons of vulnerabilities lying around historically worked OK, but Mythos shows it will soon become catastrophically not OK, and everyone who's responsible for software security needs to know this so they can take action. |