cookiengineer · 5 hours ago:
Can you elaborate on why those bugs weren't found by e.g. fuzzing in the past? I'm genuinely curious what "types" of implementation mistakes these were: e.g., library usage bugs, state management bugs, control flow bugs, etc. I'd love to see a writeup about these findings. Maybe Mythos has hinted that better fuzzing tools are needed?
IainIreland · 4 hours ago:
If I had to guess, I'd say that AI is better at finding TOCTOU (time-of-check to time-of-use) bugs than fuzzing because it starts by looking at the code and trying to find problems with it, which naturally leads it to experiment with questions like "is there any way to make this assumption false?", whereas fuzzing is more brute force. Fuzzing can explore way more possible states, but AI is better at picking good ones. In this particular sense, AI tends to find bugs that are closer to what we'd see from a human researcher reading the code. Fuzz bugs are often more "here's a seemingly innocuous sequence of statements that randomly happen to collide three corner cases in an unexpected way". Outside of SpiderMonkey, my understanding is that many of the best vulnerabilities were in code that is difficult to fuzz effectively for whatever reason. The classic shape of a TOCTOU bug looks something like the sketch below.
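Here's a minimal, hypothetical C++ sketch of that pattern (not actual SpiderMonkey code; Buffer, copyInto, and the callback are made up for illustration). A bounds check is performed, user-controlled code runs, and then the stale check result is used:

    #include <cstdint>
    #include <cstring>
    #include <functional>
    #include <vector>

    struct Buffer {
        std::vector<uint8_t> data;
    };

    void copyInto(Buffer& buf, size_t offset, size_t len,
                  const std::function<void()>& userCallback) {
        // Time of check: validate that the write fits in the buffer.
        if (offset + len > buf.data.size())
            return;

        // User-controlled code runs here and may resize buf.data,
        // invalidating the check above.
        userCallback();

        // Time of use: if the callback shrank the buffer, this writes
        // past the end of the (possibly reallocated) allocation.
        std::memset(buf.data.data() + offset, 0x41, len);
    }

    int main() {
        Buffer buf{std::vector<uint8_t>(64)};
        // The callback shrinks the buffer between check and use:
        // heap out-of-bounds write.
        copyInto(buf, 0, 64, [&] { buf.data.resize(8); });
    }

An auditor (human or LLM) reading this immediately asks "can userCallback change buf.data.size()?", while a fuzzer has to stumble onto an input whose callback resizes the buffer at exactly the right moment.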
mccr8 · 4 hours ago:
Fuzzing isn't good at things like dealing with code behind a CRC check, whereas the audit-based approach using an LLM can see the sketchy code, then calculate the CRC itself to come up with a test case. I think you end up having to write custom fuzzing harnesses to get at the vulnerable parts of the code; see the sketch below. (This is an example from a talk by somebody at Anthropic.) That being said, I think there's a lot of potential for synergy here: if LLMs make writing code easier, that includes fuzzers, so maybe fuzzers will also end up finding a lot more bugs. I saw somebody on Twitter say they used an LLM to write a fuzzer for Chrome and found a number of security bugs that they reported.
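A minimal C++ sketch of the "code behind a CRC check" problem (handlePacket and parsePayload are hypothetical; crc32 is the standard bitwise CRC-32). A blind mutation passes the checksum with probability ~2^-32, so a naive fuzzer almost never reaches the guarded code, while a harness-side fixup (or an LLM computing the CRC for a single proof of concept) gets past it trivially:

    #include <cstddef>
    #include <cstdint>
    #include <cstring>

    // Standard bitwise CRC-32 (polynomial 0xEDB88320).
    uint32_t crc32(const uint8_t* data, size_t len) {
        uint32_t crc = 0xFFFFFFFFu;
        for (size_t i = 0; i < len; i++) {
            crc ^= data[i];
            for (int b = 0; b < 8; b++)
                crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
        return ~crc;
    }

    // Hypothetical stand-in for the code we actually want to test.
    void parsePayload(const uint8_t*, size_t) {
        // vulnerable parsing code would live here
    }

    void handlePacket(const uint8_t* pkt, size_t len) {
        if (len < 4) return;
        uint32_t stored;
        std::memcpy(&stored, pkt + len - 4, sizeof(stored));
        // Random mutation of pkt almost never satisfies this check,
        // so a naive fuzzer rarely reaches parsePayload().
        if (crc32(pkt, len - 4) != stored) return;
        parsePayload(pkt, len - 4);
    }

    // Custom harness: mutate the payload freely, then patch the
    // trailing CRC so handlePacket() accepts the input.
    void fuzzOneInput(uint8_t* pkt, size_t len) {
        if (len < 4) return;
        uint32_t fresh = crc32(pkt, len - 4);
        std::memcpy(pkt + len - 4, &fresh, sizeof(fresh));
        handlePacket(pkt, len);
    }

The same fixup trick applies to any input validation that a mutator can't satisfy by chance (length fields, magic bytes, signatures you control the key for), which is exactly the kind of harness-writing work LLMs could automate.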