| ▲ | nate a day ago | |||||||
I'm about to launch an agent I made. Got an A+. One big reason it did so well though, right or wrong, is the agent screenshots sites and uses those to interpret what the hell is going on. So obviously removes the secret injections you can't see visibly. But also has some nice properties of understanding the structure of the page after it's rendered and messed with javascript wise. e.g. "Click on an article" makes more sense from the image than traversing the page content looking for random links to click. Of course, it's kinda slow :) | ||||||||
| ▲ | joozio a day ago | parent [-] | |||||||
That's a really interesting edge case - screenshot-based agents sidestep the entire attack surface because they never process raw HTML. All 10 attacks here are text/DOM-level. A visual-only agent would need a completely different attack vector (like rendered misleading text or optical tricks). Might be worth exploring as a v2. | ||||||||
| ||||||||