adrian_b 3 hours ago

The OpenBSD bug was more difficult for LLMs because it is an integer overflow bug, while out-of-bounds accesses are a more common class of bug that most models find.
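
To illustrate why an integer overflow can be harder to spot than a plain out-of-bounds access, here is a minimal sketch. It is a generic example, not the actual OpenBSD code: the function names and the values are hypothetical, and the 32-bit wraparound that C would do on unsigned arithmetic is simulated with a mask. The bounds check looks correct when read line by line, and it only fails once the addition wraps.

```python
MASK32 = 0xFFFFFFFF  # simulate 32-bit unsigned arithmetic (assumption for this sketch)


def wrapping_add_u32(a: int, b: int) -> int:
    """Add two values the way a 32-bit unsigned integer would, wrapping on overflow."""
    return (a + b) & MASK32


def naive_check(offset: int, length: int, buf_size: int) -> bool:
    """Mimics `offset + length <= buf_size` evaluated in 32-bit unsigned arithmetic."""
    return wrapping_add_u32(offset, length) <= buf_size


def safe_check(offset: int, length: int, buf_size: int) -> bool:
    """Same intent, but rearranged so that no intermediate value can wrap."""
    return offset <= buf_size and length <= buf_size - offset


# Hypothetical attacker-controlled values: the sum wraps around to 0x10.
offset, length, buf_size = 0xFFFFFFF0, 0x20, 64
print(naive_check(offset, length, buf_size))  # True: the check passes, the access is out of bounds
print(safe_check(offset, length, buf_size))   # False: the request is rejected
```

An auditor (human or model) has to reason about the value ranges, not only about the comparison, which is probably why this kind of bug is missed more often.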

The OpenBSD bug was also found by GPT-OSS and by Kimi-K2:

https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag...

The first condition for finding a bug is to actually audit the code where bugs exist.

For a human this is a lot of work, so it is often skipped. LLMs can make it easier, but only if you actually use them for this purpose.

As the link above shows, using multiple older open weights models was enough to find all the bugs found by Mythos.

The improvement demonstrated by Mythos is that it could be used alone to find all those bugs, while with older models you had to run several of them to find everything, because each model would find only a subset of the bugs.
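
The "run several models and merge the results" idea can be sketched as a simple set union. This is only an illustration: the model names and bug identifiers below are placeholders invented for the example, not data from the linked post.

```python
from typing import Dict, List, Set


def combined_findings(findings_by_model: Dict[str, Set[str]]) -> Set[str]:
    """Union of every bug reported by at least one model."""
    combined: Set[str] = set()
    for found in findings_by_model.values():
        combined |= found
    return combined


def models_covering_all(findings_by_model: Dict[str, Set[str]]) -> List[str]:
    """Names of the models that, alone, report every bug in the combined set."""
    everything = combined_findings(findings_by_model)
    return [name for name, found in findings_by_model.items() if found == everything]


# Placeholder data, made up purely to show the shape of the comparison.
example = {
    "older_model_a": {"bug-1", "bug-2"},
    "older_model_b": {"bug-2", "bug-3"},
    "older_model_c": {"bug-4"},
    "single_strong_model": {"bug-1", "bug-2", "bug-3", "bug-4"},
}

older_only = {name: found for name, found in example.items() if name.startswith("older_")}
print(sorted(combined_findings(older_only)))
print(models_covering_all(example))
```

In this made-up case the three older models together cover the same set as the single strong model, while none of them covers it alone.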

Even so, I prefer using all those open weights models together, at a negligible additional cost, while Mythos is unavailable to non-privileged users, and even once it becomes available to more people it will be much more expensive than the alternatives.

hn_throwaway_99 16 minutes ago | parent [-]

Thank you so much. Your comment and the linked blog post are exactly the deep analysis/explanation I was looking for.