| ▲ | magicalhippo 5 hours ago |
| Key point is that Claude did not find the bug it exploits. It was given the CVE writeup[1] and was asked to write a program that could exploit the bug. That said, given how things are, I wouldn't be surprised if you could let Claude or similar have a go at the source code of the kernel or core services, armed with some VMs for try-fail iteration, and get it pumping out CVEs. If not now, then surely in the not too distant future. [1]: https://www.freebsd.org/security/advisories/FreeBSD-SA-26:08... |
|
| ▲ | muskstinks an hour ago | parent | next [-] |
| You might want to watch this: https://www.youtube.com/watch?v=1sd26pWhfmg Claude is already able to find CVEs on expert level. |
| |
| ▲ | shimman 25 minutes ago | parent | next [-] | | A talk given by an employee that stands to make millions from Anthropic going public, definitely not a conflict of interest by the individual. | | |
| ▲ | pama 8 minutes ago | parent | next [-] | | It is by the individual who (also with Claude) found the specific vulnerability used in this exploit. | |
| ▲ | muskstinks 22 minutes ago | parent | prev [-] | | I didn't say "watch this without critical thinking". The chance that this is completely fabricated is very low, though, and it's a highly interesting signal to many others. There was also a really good AI CTF talk at the 39C3 hacker conference just 4 months ago. |
| |
| ▲ | 27 minutes ago | parent | prev | next [-] | | [deleted] | |
| ▲ | Bender 11 minutes ago | parent | prev [-] | | > Claude is already able to find CVEs on expert level. Does it fix them as fast as it finds them? Bonus if it adds snarky code comments. |
|
|
| ▲ | ogig 3 hours ago | parent | prev | next [-] |
| Setting up fuzzing used to be hard. I haven't tried yet, but my bet is that having Claude Code, today, analyze a codebase, suggest where and how to fuzz-test it, review the crashes, and iterate will produce CVEs. |
|
| ▲ | Cloudef 3 hours ago | parent | prev | next [-] |
| You can let an agent churn unattended if you have some sort of known goal. Write a test that should not pass and then tell the agent to come up with something that passes the test without changing the test itself. For this kind of fuzzing LLMs are not bad. |
| |
| ▲ | vinnymac 2 hours ago | parent [-] | | When doing this, remove write permissions on the test file; the agent will do a much better job of staying the course over long periods. I've been doing this for over a year now. |
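A minimal sketch of the read-only test file idea above, assuming a POSIX-style permission model and an agent that runs as an ordinary (non-root) user. The helper name `make_read_only` and the path `tests/test_target.py` are hypothetical, chosen only for illustration; this is not a description of any commenter's actual setup.

```python
import os
import stat


def make_read_only(path: str) -> int:
    """Clear every write bit on `path` and return the resulting permission bits.

    Assumption: the agent runs as a non-root user, so a file without write
    bits cannot be edited in place. Root (or an agent allowed to run chmod
    itself) can still get around this.
    """
    current = stat.S_IMODE(os.stat(path).st_mode)
    # Drop write permission for owner, group, and others; keep read/execute bits.
    read_only = current & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    os.chmod(path, read_only)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # Hypothetical path to the failing test the agent must make pass.
    test_file = "tests/test_target.py"
    if os.path.exists(test_file):
        print(oct(make_read_only(test_file)))
```

This only protects the file contents. A user who can write to the containing directory can still delete or replace the file, so a stricter setup would also restrict the directory or run the agent in a sandbox.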
|
|
| ▲ | fragmede 5 hours ago | parent | prev | next [-] |
| > Credits: Nicholas Carlini using Claude, Anthropic Claude was used to find the bug in the first place though. That CVE write-up happened because of Claude, so while there are some very talented humans in the loop, Claude is quite involved with the whole process. |
| |
| ▲ | magicalhippo 5 hours ago | parent | next [-] | | > Claude was used to find the bug in the first place though. That CVE write-up happened because of Claude Do you have a link to that? A rather important piece of context. Wasn't trying to downplay this submission, by the way; the main point still stands: But finding a bug and exploiting it are very different things. Exploit development requires understanding OS internals, crafting ROP chains, managing memory layouts, debugging crashes, and adapting when things go wrong. This has long been considered the frontier that only humans can cross. Each new AI capability is usually met with “AI can do Y, but only humans can do X.” Well, for X = exploit development, that line just moved. | | | |
| ▲ | bayindirh 3 hours ago | parent | prev [-] | | Yes, that claim needs a source. |
|
|
| ▲ | petcat 5 hours ago | parent | prev [-] |
| > have a go at the source code of the kernel or core services, armed with some VMs for the try-fail iteration, and get it pumping out CVEs. The FreeBSD kernel is written in C, right? AI bots will trivially find CVEs. |
| |
| ▲ | pjmlp 4 hours ago | parent [-] | | The Morris worm lesson is yet to be taken seriously. | | |
| ▲ | pitched 4 hours ago | parent [-] | | We’re here right now looking at a CVE. That has to count as progress? |
|
|