jerf 2 hours ago

Acknowledging that we still only have marketing material, it is their claims about Mythos' ability to auto-generate working exploits that actually change the cost/benefit tradeoffs. Their own Mythos docs showed that it is only a marginal improvement over current models in generating hypotheses about exploits; the difference was finding the exploits automatically (and correctly).

I kind of confirmed this myself. I pointed Opus 4.6 at some internal code bases. It came up with a list of possibilities. The quality of the possibilities was quite mixed and the exploit code generally worthless. So I did at least do a spot check on that aspect of their marketing and it checked out.

The problem is that this changes the attacker versus defender calculus. Right now, the world is basically a big pile of Swiss cheese, but we are not all being continuously popped for full access to everything, because exploitation is fundamentally blocked on human attackers analyzing the output of tools, validating the exploits, and then deciding whether or not to use them.

That "whether or not to use them" calculus is also profoundly affected by the fact that attackers can generally model the exploits they've taken to completion as fairly likely to be uniquely theirs and unlikely to be fixed in the target software, so they can afford to sit on them because they are not rotting terribly quickly. It is well known that intelligence agencies, when deciding whether or not to attack something, also consider the risk of leaking the mechanism they used and losing it for future attacks as a result. A particularly well-documented historical example is how the Allies used the fact that they had broken Enigma but had to be careful exactly how they used the information they obtained that way, lest the Axis work out what the problem was and fix it. All that calculus is still in play today.

The fundamental problem with the claims made about Mythos isn't that it can find things that may be vulnerabilities; the fundamental sea change they are claiming is a hugely increased effectiveness in generating the exploits. There's a world of difference in the cost/benefit calculus for attackers and defenders between getting a cheap list of things humans can consider, which was only a quantitative change over the world we've lived in up to this point, and the humans being handed a list of verified (and likely pre-weaponized with just a bit more prompting) vulnerabilities, where the humans at most have to test each one a bit in the lab before putting it in the toolbelt. That is a qualitative change in the attacker's capabilities.

There is also the second-order effect that if everybody can do this, the attackers will stop assuming that they can sit on exploits until a particularly juicy target worth the risk of burning the exploit comes up. That gets shifted on two fronts. First, exploits are cheaper, so there's less need to worry about burning a particular one. Second, in a world where everyone has Mythos, everyone is scanning everything all the time with this more sophisticated exploiting firepower and is just as likely to find the exploit as the nation-state attackers are, so the attackers have to assume they need to use the exploits now, even if it's a lower-value attack, because there may not be a later.
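To make the "use it now or sit on it" tradeoff concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the function names, the one-period model, and the numbers are hypothetical and do not come from the Mythos material or any real data.

```python
# Toy one-period model of the hold-vs-use decision (illustrative assumptions only).

def value_of_waiting(high_value, p_target_appears, p_survives):
    """Expected payoff of sitting on an exploit for one more period.

    high_value:       payoff if a juicy target shows up (hypothetical units)
    p_target_appears: chance such a target shows up while we wait
    p_survives:       chance nobody else finds and burns the bug meanwhile
    """
    return p_survives * p_target_appears * high_value


def should_use_now(low_value, high_value, p_target_appears, p_survives):
    """True if spending the exploit on a lower-value target today beats waiting."""
    return low_value >= value_of_waiting(high_value, p_target_appears, p_survives)


if __name__ == "__main__":
    # Exploit is probably unique to us and rots slowly: holding wins.
    print(should_use_now(1, 10, 0.5, p_survives=0.9))
    # Everyone is scanning everything, so the bug probably gets found: use it now.
    print(should_use_now(1, 10, 0.5, p_survives=0.1))
```

The only knob that changes between the two calls is the survival probability, which is the point: the targets and payoffs are the same, but once the exploit is likely to be found by someone else, waiting stops being worth it.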

If, if, if, if, if the marketing is even half true, this really is a big deal, but it is the automated exploit generation that is the sea change, not just finding the vulnerabilities. And especially not finding the same vulnerabilities as Mythos but burying them in a list of many other vulnerabilities that are either not real or not practically exploitable, which then bottlenecks on human attention to filter through them. Matching Mythos, or at least Mythos' marketing, means you pushed a button (i.e., simple prompt, not knowing in advance what the vuln is, just feeding it a mass of data) and got exploit. Push button, get big unfiltered list of possible vulnerabilities is not the same. Push button, get correct vulnerability is closer, but still not the same. The problem here is specifically "push button, get exploit".