Remix.run Logo
firer 3 hours ago

This is very similar in root cause and exploitation to Copy Fail.

Which illustrates pretty well something that's lost when relying heavily on LLMs to do work for you: exploration.

I find that doing vulnerability research using AI really hinders my creativity. When your workflow consists of asking questions and getting answers immediately, you don't get to see what's nearby. It's like a genie - you get exactly what you asked for and nothing more.

The researcher who discovered Copy Fail relied heavily on AI after noticing something fishy. If he had to manually wade through lots of code by himself, he would have many more chances to spot these twin bugs.

At the same time, I'm pretty sure that by using slightly less directed prompting, a frontier LLM would found these bugs for him too.

It's a very unusual case of negative synergy, where working together hurt performance.

eqvinox 3 hours ago | parent | next [-]

No, unless I'm misreading it it's the *same* root cause: high 32 bits of Extended ESN in IPsec == authencesn module/cipher mode.

The wrong thing got fixed for copy.fail, because people jumped to blame AF_ALG.

[ed.: yes it's the same authencesn issue. https://github.com/V4bel/dirtyfrag/blob/892d9a31d391b7f0fccb... it doesn't say authencesn in the code, only in a comment, but nonetheless, same issue.]

[ed.2: the RxRPC issue is separate, this is about the ESP one]

firer 2 hours ago | parent [-]

There are two vulnerabilities here.

The RxRPC one is definitely a different root cause (although caused by a very similar mistake).

For the ESP one it's a bit harder to tell. I don't think the wrong thing was fixed, just that there was a very similar bug in almost the same spot. Could be wrong about that though.

eqvinox 2 hours ago | parent [-]

(you probably wrote this while I was editing my post.)

It's absolutely the same issue in authencesn/ESP. There's another one in RxRPC that is AIUI completely unrelated.

papascrubs 2 hours ago | parent | prev | next [-]

Or a follow up prompt: "find similar classes of bugs". Once the actual case has been layed out finding like bugs isn't too hard. I hear you on the creativity bit. Like any tool, AI can put blinders on. Using it to augment without it fully taking over your workflow is tough.

tptacek 2 hours ago | parent | prev | next [-]

I don't follow. LLMs spotted these bugs in the first place. You seem to be saying that these discoveries are indications that they're bad for vulnerability discovery.

firer 2 hours ago | parent | next [-]

From what I understand, the copy fail bug was found by researcher who noticed something weird and then using AI to scan the codebase for instances where that becomes a problem.

I bet that with a slightly looser prompt/harness, the LLM could have found these twin bugs too.

Yet at the same time, I also think that if the human researcher had manually scanned the code, he'd have noticed these bugs too.

FWIW I do think LLMs are great tools for finding vulnerabilities in general. Just that they were visibly not optimally applied in this case.

eqvinox 2 hours ago | parent | prev | next [-]

I don't think the copy.fail people understood the issue they found, as is evident by the heavy focus on AF_ALG/aead_algif, which is essentially "innocent" as we're seeing here.

I think LLMs are great for vulnerability discovery, but you need to not skimp on the legwork and understanding what even you just found there.

tptacek 2 hours ago | parent [-]

Right but without the LLM the bug doesn't get found at all.

eqvinox 2 hours ago | parent [-]

Yes, I agree. I'm not the GP poster.

parliament32 2 hours ago | parent | prev [-]

No, they did not. Careful of falling for the psychosis.

> This finding was AI-assisted, but began with an insight from Theori researcher Taeyang Lee, who was studying how the Linux crypto subsystem interacts with page-cache-backed data.

https://xint.io/blog/copy-fail-linux-distributions

tptacek 2 hours ago | parent [-]

Theori is an AI security research firm.

danudey an hour ago | parent [-]

It seems as though this issue occurred to him, then he used their tool ("Xint Code") to analyze the codebase for instances of it.

refulgentis an hour ago | parent | prev | next [-]

It’s very hard to see a root vuln similar to, but not the same as, another discovered by AI, as a lesson about AI not exploring.

Is there a counterfactual where you would say it explored well enough, besides both vulnerabilities published as one?

varispeed an hour ago | parent | prev | next [-]

> When your workflow consists of asking questions and getting answers immediately, you don't get to see what's nearby.

That's why is very very important to just step out and use saved time to go for a walk, to a park, sit on a bench, listen do birds, close eyes and zoom out.

The state we are in is actually brilliant.

formerly_proven 3 hours ago | parent | prev [-]

These are all page cache poisoning attacks (dirtyfrag, copyfail, dirtypipe). Maybe the page cache should have defense-in-depth measures for SUID binaries?

firer 3 hours ago | parent [-]

SUID mitigations have nothing to do with the vulnerability itself - just the exploit.

If there's a root cronjob that runs a world readable binary, you could modify it in the page cache and exploit it that way.

Modifying the page cache is a really strong primitive with countless ways to exploit it.

eqvinox 2 hours ago | parent | next [-]

splice() should maybe generally refuse to operate on things you can't write to.

toast0 an hour ago | parent [-]

splice is documented to return EBADF if "One or both file descriptors are not valid, or do not have proper read-write mode."

So it seems surprising to me that you can call it when the out fd is not writable? But I didn't retain the information about the vulnerability, so I'm missing something. There was something about copy on write, IIRC?

eqvinox an hour ago | parent [-]

"proper read-write mode" for the input fd is reading only. The exploit is writing to the splice() input fd.

Also, NB, I said permission check, not mode check. The input fd to splice can and will be open for only reading quite often. Doesn't mean the kernel can't still do a write permission check.

(Except I didn't say that here. Oops. Getting confused with my posts.)

toast0 an hour ago | parent [-]

OK, I may likely have too much sleep debt to understand, but given the bug is that splice can write to the input fd, you're suggesting maybe splice should only let you use an input fd if the process has access to write to it?

But splice is a more or less a generalization of sendfile, and sendfile is often used for webserving where the serving process does not have ownership of the documents it is serving. It doesn't make sense to limit splice such that it can't do the task it was built for. Maybe splice should just not write to the input fd? :P

eqvinox 12 minutes ago | parent | next [-]

Yes, it'd curtail splice() usage quite heavily. Maybe too much.

But apparently we can't be trusted with the page cache…

Maybe the kernel using supervisor-read-only flags could be made to work, only issue then is what happens if something does in fact need to write…

semiquaver 38 minutes ago | parent | prev [-]

Aren’t you just saying “don’t write bugs?”

formerly_proven 2 hours ago | parent | prev [-]

True! Building protections (e.g. physical pages in the page cache are not writeable 100% of the time) just for executables has of course countless circumventions as well (e.g. config files). Yeah, there is probably not that much to be done there, actually. Looking at some of the diffs it seems to me like the kernel makes it really not particularly obvious when/how this goes wrong. E.g. the patch for this is to look at an additional flag on the socket buffer to fix an arbitrary page cache write. This feels rather action at a distance. Logically this of course makes sense, the whole point of splice et al is to feed data from one file-like into another file-like, whatever those ends might be. That erases the underlying provenance of the data.