hn_throwaway_99 2 hours ago

I'm particularly interested if someone with relevant expertise could comment on the types of bugs Mythos found, e.g. the 27 year old OpenBSD bug.

I ask because the media around Mythos is leaning into the "Mythos is a super intelligence that can find bugs that no human can" story. But in my mind it's pretty obvious that any software that is complex enough will have a lot of lurking zero days, and better tools will asymptotically find more of them. So it seems to me something like Mythos would just be able to do more analysis/searching for bugs at a much faster rate than previously possible. But I'm skeptical that the bugs that were found required an insane amount of analytical ability to locate, so I would really appreciate it if someone could comment on that (e.g. was it "yeah, with enough time we would have found it eventually" vs. "Wow, this was an insanely difficult bug to find in the first place").

I do agree that in the medium/long term, tools like Mythos will be a huge boon for cyber security, because they will inherently make it easier to write bug-free code in the first place. But yeah, we're now at a point where all these "pre-AI bugs" need to be fixed and patched before folks in the wild find all these zero days.

adrian_b 2 hours ago | parent [-]

The OpenBSD bug was more difficult for LLMs, because it is an integer overflow bug, while out-of-bounds accesses are more common bugs that are found by most models.
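
To illustrate why an integer overflow can be harder to spot than a plain out-of-bounds access, here is a minimal sketch. It does not reproduce the actual OpenBSD bug; the function names and values are hypothetical, and 32-bit unsigned arithmetic is emulated with a mask because Python integers do not overflow. The buggy check looks like a correct bounds check, and it only fails when the addition wraps around.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned integer (assumption for illustration)


def naive_in_bounds(offset: int, length: int, buf_size: int) -> bool:
    """Buggy check: offset + length is computed in 32 bits and can wrap to a small value."""
    end = (offset + length) & MASK32
    return end <= buf_size


def safe_in_bounds(offset: int, length: int, buf_size: int) -> bool:
    """Overflow-free check: compare against the remaining space instead of adding."""
    return offset <= buf_size and length <= buf_size - offset


# Hypothetical attacker-controlled values: the sum wraps around to 0x10.
offset, length, buf_size = 0xFFFFFFF0, 0x20, 64

print(naive_in_bounds(offset, length, buf_size))  # True: the wrapped sum passes the check
print(safe_in_bounds(offset, length, buf_size))   # False: the request is rejected
```

Both checks agree on ordinary inputs, which is why this kind of bug can survive for a long time: nothing in the code reads out of bounds unless the arithmetic itself is questioned.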

The OpenBSD bug was also found by GPT-OSS and by Kimi-K2:

https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag...

The first condition for finding a bug is to actually audit the code where bugs exist.

For a human, this is a lot of work, so it is often skipped. LLMs can make it easier, but you have to actually use them for this purpose.

As the link above shows, using multiple older open weights models was enough to find all the bugs found by Mythos.

The improvement demonstrated by Mythos is that it could be used alone to find all those bugs, while with older models you had to run several of them to find everything, because each model would find only some of the bugs.
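
The "run several models and pool the results" approach amounts to taking the union of each model's findings. A minimal sketch, where the model names and bug identifiers are made-up placeholders and not data from the linked post:

```python
def combined_coverage(findings: dict[str, set[str]]) -> set[str]:
    """Return every bug reported by at least one model."""
    return set().union(*findings.values())


# Hypothetical findings per model (placeholders, not real results).
findings = {
    "model_a": {"bug-1", "bug-2"},
    "model_b": {"bug-2", "bug-3"},
    "model_c": {"bug-4"},
}

all_bugs = combined_coverage(findings)

# What each model would have missed if it had been used alone.
missed_by = {name: all_bugs - found for name, found in findings.items()}

print(sorted(all_bugs))
for name in sorted(missed_by):
    print(name, "misses", sorted(missed_by[name]))
```

In this toy data no single model covers everything, but the three together do, which is the situation described above for the older models.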

Even so, I prefer using all those open weights models together, at a negligible additional cost, since Mythos is unavailable to non-privileged users, and even when it becomes available to more people it will be much more expensive than the alternatives.