tptacek a day ago
I have never heard of "Heidy Khlaaf, chief AI scientist at the AI Now Institute", but the sentiment in this article is diametrically opposed to that of the vulnerability research scene. There is contention among vulnerability researchers about the impact of Mythos! But it's not "are frontier models going to shake up vulnerability research and let loose a deluge of critical vulnerabilities" --- software security people overwhelmingly believe that to be true. Rather, it's whether Mythos is truly a step change from 4.7 and 5.5.

For vulnerability researchers, the big "news" wasn't Mythos, but rather Carlini's talk from Unprompted, where he got on stage and showed his dumb-seeming "find me zero days" prompt, which actually worked. The big question for vulnerability people now isn't "AI or no AI"; it's "running directly off the model, or building fun and interesting harnesses".

Later I spoke with someone who has been professionally acquainted with Khlaaf. Khlaaf is a serious researcher, but not a software security researcher; it's not their field. I think what's happening here is that the BBC doesn't know the difference between AI safety prognosis and software security prognosis, or whom to talk to for each topic.
adrian_b a day ago
I doubt very much that a "find me zero days" prompt worked, because I am not aware of the slightest evidence of this. The Anthropic report that describes the bugs they found with Mythos in various open-source projects admits that a prompt like "find me zero days" does not work with Mythos.

To find bugs, they ran Mythos a large number of times on each file of the scanned project, with different prompts. They started with a more generic prompt intended to gauge whether the file was likely to contain bugs at all, in order to decide whether it was worthwhile to run Mythos many times on it. Then they used increasingly specific prompts to identify various classes of bugs. Finally, when it was reasonably certain that a bug existed, Mythos was run one more time with a prompt asking it to confirm the identified bug (and to produce an exploit or a patch).

Because what you say about Carlini is in obvious contradiction with Anthropic's own technical report about Mythos, I assume that it was either pure BS or a demo run on a fake program with artificial bugs. Or else the so-called prompt was not an LLM prompt at all, but merely the name of a command for a bug-finding harness that runs the LLM in a loop with various suitable prompts, as described by Anthropic.
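In code, that loop would look roughly like the sketch below. This is only an illustration of the multi-pass workflow described above; query_model, the prompt texts, and the run counts are my own assumptions, not anything taken from Anthropic's report or harness.

    # Minimal sketch of a multi-pass bug-finding harness: triage a file,
    # then run narrower bug-class prompts many times, then confirm findings.
    TRIAGE_PROMPT = ("Does this file plausibly contain a security-relevant bug "
                     "worth deeper analysis? Answer YES or NO.")
    CLASS_PROMPTS = [
        "Look specifically for out-of-bounds reads or writes in this file.",
        "Look specifically for use-after-free or double-free bugs in this file.",
        "Look specifically for integer overflow or truncation bugs in this file.",
    ]
    CONFIRM_PROMPT = ("Re-examine this candidate finding. If the bug is real, "
                      "produce a proof-of-concept input or a patch; otherwise "
                      "answer FALSE POSITIVE.\n\n{candidate}")

    def scan_file(path, query_model, runs_per_prompt=3):
        """query_model(prompt, source) -> str is whatever LLM call the harness wraps."""
        source = open(path).read()
        # Pass 1: cheap triage to decide whether the file deserves many runs.
        if not query_model(TRIAGE_PROMPT, source).strip().upper().startswith("YES"):
            return []
        findings = []
        # Pass 2: repeated runs with narrower, bug-class-specific prompts.
        for prompt in CLASS_PROMPTS:
            for _ in range(runs_per_prompt):
                candidate = query_model(prompt, source).strip()
                if not candidate:
                    continue
                # Pass 3: one more run asking the model to confirm the finding
                # and to back it with an exploit or a patch.
                verdict = query_model(CONFIRM_PROMPT.format(candidate=candidate), source)
                if "FALSE POSITIVE" not in verdict.upper():
                    findings.append((path, candidate, verdict))
        return findings

The point of staging it this way is cost: the cheap triage pass gates the expensive repeated runs, and the confirmation pass filters the model's false positives before a human ever looks at them.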