paulvnickerson 6 hours ago
What types of vulnerabilities was it finding? Cross-site scripting, privilege escalation, etc.? Mostly memory corruption, or any JavaScript logic bugs?
IainIreland 5 hours ago
I work on SpiderMonkey, so I mostly looked at the JS bugs. It was a smorgasbord of various things. Broadly speaking I'd say the most impressive bugs were TOCTOU issues, where we checked something and later acted on it, and the testcase found a clever way to invalidate the result of the check in between. If you look closely at, say, this patch, you might get a sense of what I mean (although the real cleverness is in the testcase, which we have not made public): https://hg-edge.mozilla.org/integration/autoland/rev/c29515d...
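
For readers unfamiliar with the pattern, here is a minimal Python analogy of a check-then-act (TOCTOU) bug of the kind described above. It is not the SpiderMonkey bug or the linked patch; `ShrinkOnConvert`, `store_unsafe`, and `store_safe` are hypothetical names invented for illustration. The idea is that a bounds check is done first, then a value conversion runs user-controlled code that shrinks the buffer, so the earlier check is stale by the time it is acted on.

```python
class ShrinkOnConvert:
    """Hypothetical value whose conversion hook mutates the buffer."""

    def __init__(self, buf):
        self.buf = buf

    def __int__(self):
        # User-controlled code running between the check and the use.
        self.buf.clear()
        return 7


def store_unsafe(buf, index, value):
    # Check: the index is in bounds right now.
    if not (0 <= index < len(buf)):
        return False
    # Converting the value may run arbitrary user code (see __int__ above).
    converted = int(value)
    # Use: relies on the stale check. Python raises IndexError here;
    # unchecked native code would write out of bounds instead.
    buf[index] = converted
    return True


def store_safe(buf, index, value):
    # Run anything that can call user code first, then check, then act.
    converted = int(value)
    if not (0 <= index < len(buf)):
        return False
    buf[index] = converted
    return True
```

In Python the stale check only surfaces as an `IndexError`, because list stores are bounds-checked again at the point of use. In an engine written in C++ the equivalent mistake can become memory corruption, which is why reordering the check after the last point where user code can run (as in `store_safe`) is a common shape for the fix.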
mccr8 5 hours ago
I'd say it leans towards memory corruption kinds of issues, as those are the easiest to get past the validator, thanks to AddressSanitizer. I think there's a lot of potential for making the validator more sophisticated. Like maybe you add a JS function that will only crash when run in the parent process and have a validator that checks for that specific crash, as a way for the LLM to "prove" that it managed to run arbitrary JS in the parent. Would that turn up subtler issues? Maybe.
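
As a rough sketch of what such a two-signal validator might look like, assuming the test harness captures the crash output as text: the `CANARY` marker string, the `Verdict` type, and the `validate` function below are all hypothetical and do not describe the real validator. The only non-invented detail is that AddressSanitizer reports contain the text `ERROR: AddressSanitizer`.

```python
from dataclasses import dataclass

# Hypothetical marker that the parent-only crashing function would emit.
CANARY = "PARENT_PROCESS_CANARY_CRASH"

# Text that appears in AddressSanitizer error reports.
ASAN_MARKER = "ERROR: AddressSanitizer"


@dataclass
class Verdict:
    accepted: bool
    reason: str


def validate(log: str) -> Verdict:
    """Accept a testcase if its captured output shows a known crash signature."""
    # The canary can only crash in the parent process, so seeing it is
    # treated as proof that the testcase ran its JS there.
    if CANARY in log:
        return Verdict(True, "canary crash: JS ran in the parent process")
    if ASAN_MARKER in log:
        return Verdict(True, "memory error reported by AddressSanitizer")
    return Verdict(False, "no recognised crash signature")
```

For example, `validate("==123==ERROR: AddressSanitizer: heap-use-after-free")` is accepted as a memory error, while output with neither marker is rejected. Checking the canary first means a testcase that reaches the parent process is credited for that, even if it also happens to trip AddressSanitizer along the way.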