▲ rando77 2 days ago

Sounds like they need another agent to detect false positives (I joke, I joke)
▲ dotty- 2 days ago | parent [-]
You joke, but that's a very real approach that AI pentesting companies do take: an agent that creates reports, and an agent that 'validates' reports with 'fresh context' and a different system prompt that attempts to reproduce the vulnerability based on the report details.

*Edit: the paper seems to suggest they had a 'Triager' for vulnerability verification, and obviously that didn't catch all the false positives either, ha.
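A minimal sketch of the reporter/validator pattern described above: one agent drafts a finding, then a second agent with fresh context and a different system prompt tries to reproduce it before it is accepted. `call_llm` is a hypothetical stand-in for any chat-completion API, stubbed here with canned responses so the sketch runs; the prompts and the example report are illustrative, not from any real product.

```python
# Hypothetical prompts for the two roles. The validator deliberately gets
# a skeptical framing, unlike the reporter.
REPORTER_PROMPT = "You are a pentester. Report suspected vulnerabilities."
VALIDATOR_PROMPT = (
    "You are a skeptical triager. Given only the report below, attempt to "
    "reproduce the vulnerability. Answer CONFIRMED or NOT_REPRODUCED."
)

def call_llm(system_prompt: str, user_msg: str) -> str:
    # Stubbed model call so the sketch is self-contained; swap in a real
    # chat-completion client in practice.
    if "triager" in system_prompt:
        # Validator only confirms reports that include reproduction steps.
        if "steps to reproduce" in user_msg.lower():
            return "CONFIRMED"
        return "NOT_REPRODUCED"
    return (
        "Possible SQLi in /login. Steps to reproduce: ' OR 1=1 -- in the "
        "username field bypasses authentication."
    )

def validated_findings(targets: list[str]) -> list[str]:
    confirmed = []
    for target in targets:
        report = call_llm(REPORTER_PROMPT, f"Assess {target}")
        # Fresh context: the validator sees only the report text, never the
        # reporter's conversation history, so it cannot inherit its biases.
        verdict = call_llm(VALIDATOR_PROMPT, report)
        if verdict == "CONFIRMED":
            confirmed.append(report)
    return confirmed
```

The key design point is the context isolation: because the validator starts from a blank slate, a hallucinated finding with no reproducible details has a chance of being rejected rather than rubber-stamped, though as the edit above notes, this filter is far from perfect.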