skeledrew 14 hours ago

> Claude Code works on closed source (but decompiled) source

Very likely not nearly as well, unless there are many open source libraries in use and/or the language+patterns used are extremely popular. The really huge win for something like the Linux kernel and other popular OSS is that the source appears in the training data, a lot, and in many versions. So providing the source again and saying "find X" is primarily bringing into focus things it's already seen during training, with little novelty beyond the updates that happened after the knowledge cutoff.

Giving it a closed source project containing a lot of novel code means it only has the language and its "intuition" to work from, which is a far greater ask.

kasey_junk 14 hours ago | parent [-]

I’m not a security researcher, but I know a few and I think universally they’d disagree with this take.

The LLMs know about every previously disclosed security vulnerability class and can use that to pattern match. And they can do it against compiled and in some cases obfuscated code as easily as source.

I think the security engineers out there are terrified that the balance of power has shifted too far toward finding closed source vulnerabilities, because getting patches deployed will still take so long. Not that the LLMs are in some way hampered by novel codebases.

zahlman 5 hours ago | parent | next [-]

> The LLMs know about every previously disclosed security vulnerability class and can use that to pattern match

Do the reports include patterns that could be matched against decompiled code, though? As easily as they would against proper source? I find it a bit hard to believe.

skeledrew 13 hours ago | parent | prev [-]

Many vulnerabilities aren't just pattern matching though; deep understanding of the context in the particular codebase is also needed. And a novel codebase means more attention than usual will be spent grepping and keeping the context in focus. That makes it easier to miss certain things than if enough of the context were already encoded in the model weights.

Same thing applies to humans: the better someone knows a codebase, the better they will be at resolving issues, etc.

tptacek 10 hours ago | parent [-]

Almost all vulnerabilities are either direct applications of known patterns, incremental extensions of them, or chains of multiple such steps.