david_shaw 5 hours ago

There's a lot of skepticism in the security world about whether AI agents can "think outside the box" enough to replicate or augment senior-level security engineers.

I don't yet have access to Claude Code Security, but I think that line of reasoning misses the point. Maybe even the real benefit.

Just like architectural thinking is still important when developing software with AI, creative security assessments will probably always be a key component of security evaluation.

But you don't need highly paid security engineers to tell you that you forgot to sanitize input, or you're using a vulnerable component, or to identify any of the myriad issues we currently use "dumb" scanners for.

My hope is that tools like this can help automate away the "busywork" of security. We'll see how well it really works.

ping00 an hour ago | parent | next [-]

as a pentester at a Fortune 500: I think you're on the mark with this assessment. Most of our findings (internally) are "best practices"-tier stuff (make sure to use TLS 1.2, cloud config findings from Wiz, occasionally the odd IDOR vuln in an API set, etc.) -- in a purely timeboxed scenario, I'd feel much more confident in an agent's ability to look at a complex system and identify all the 'best practices' kind of stuff vs a human being.

Security teams are expensive and deal with huge streams of data and events on the blue side: seems like human-in-the-loop AI systems are going to be much more effective, especially with the reasoning advances we've seen over the past year or so.

tptacek 6 minutes ago | parent [-]

Every conversation I've been a party to has been premised on humans in the loop; I think fully-automated luxury space vulnerability research is something that only exists in message board imaginations.

samuelknight 3 hours ago | parent | prev | next [-]

LLMs and particularly Claude are very capable security engineers. My startup builds offensive pentesting agents (so more like red teaming), and if you give it a few hours to churn on an endpoint it will find all sorts of wacky things a human won't bother to check.

tptacek 5 hours ago | parent | prev | next [-]

I am seeing something closer to the opposite of skepticism among vulnerability researchers. It's not my place to name names, but for every Halvar Flake talking publicly about this stuff, there are 4 more people of similar stature talking privately about it.

decidu0us9034 3 hours ago | parent [-]

People use whatever tools are the most effective and they have plenty of incentive not to talk publicly about them. I think the era of openness has passed us by. But why does stature matter anyway? If I look at chromium or MSRC bug reports, scarcely any of the submitters are from Europe/US and they certainly don't have anything resembling stature. That guy hasn't done anything of note in the field in a long time from what I know, he's kind of a boomer (you too, no disrespect).

lich_king a minute ago | parent [-]

Vulnerability research is exciting and profitable, but it has three problems. First, it's mentally exhausting. Second, the income it generates is very unpredictable. Third, it's sort of... futile. You can find 1,000 vulnerabilities and nothing changes.

So yeah, it's the domain of young folks, often from countries where $10k or $100k goes much farther than in the US. But what happens to vulnerability researchers once they turn 35? They often end up building product security programs or products to move the needle, often out of the limelight because they no longer have anything to prove.

When you dunk on "boomers", you're dunking on people who are running teams that build browser mitigations, detection mechanisms, in-house fuzzing infrastructure for companies like Nvidia, and so forth. And yeah, they often know what they're doing and they're the ones writing checks for the young'uns to test these defenses and report more bugs.

awestroke 5 hours ago | parent | prev [-]

Claude Opus 4.6 has been amazing at identifying security vulnerabilities for us. Less than 50% false positives.

john_strinlai 5 hours ago | parent [-]

[dead]