| ▲ | userbinator 18 hours ago |
| Not "hidden", but probably more like "no one bothered to look". The request declares a 1024-byte owner ID, which is an unusually long but legal value for that field, while the kernel uses a memory buffer that's only 112 bytes. The denial message includes the owner ID, which can be up to 1024 bytes, bringing the total size of the message to 1056 bytes, so the kernel writes 1056 bytes into a 112-byte buffer. When I'm designing protocols or writing code with variable-length elements, "what is the valid range of lengths?" is always at the front of my mind. This is something a lot of static analysers can easily find. Of course, asking an LLM to "inspect all fixed-size buffers" may give you a bunch of hallucinations too, but it could be a good starting point for further inspection. |
|
| ▲ | tptacek 10 hours ago | parent | next [-] |
| "No one bothered to look" is how most vulnerabilities work. Systems development produces code artifacts with compounding complexity; it is extraordinarily difficult to keep up with it manually, as you know. A solution to that problem is big news. Static analyzers will find all possible copies of unbounded data into smaller buffers (especially when the size of the target buffer is easily deduced). They will then report them whether or not every path to that code clamps the input. Which is why this approach doesn't work well in the Linux kernel in 2026. |
| |
| ▲ | rubendev 9 hours ago | parent [-] | | With a capable static analyzer, that is not true. In many common cases it can deduce the possible range of values from branching checks along the data-flow path, and if that range fits within the buffer, it does not report the copy. | | |
| ▲ | tptacek 8 hours ago | parent [-] | | Be specific. Which analyzer are you talking about and which specific targets are you saying they were successful at? | | |
| ▲ | canucker2016 5 hours ago | parent [-] | | Intrinsa's PREfix static source code analyzer would model the execution of the C/C++ code to determine values which would cause a fault. IIRC they were using a C/C++ compiler front end from EDG to parse C/C++ code to a form they used for the simulation/analysis. see https://web.eecs.umich.edu/~weimerw/2006-655/reading/bush-pr... for more info. Microsoft bought Intrinsa several years ago. | | |
| ▲ | tptacek 4 hours ago | parent [-] | | I'm sure this is very interesting work, but can you tell me what targets they've been successful surfacing exploitable vulnerabilities on, and what the experience of generating that success looked like? I'm aware of the large literature on static analysis; I've spent most of my career in vulnerability research. | | |
| ▲ | canucker2016 2 hours ago | parent [-] | | PREfix wasn't designed specifically for finding exploitable bugs - it was aimed somewhere in between Purify (runtime bug detection) and being a better lint. One of the papers I recall noted that the big problem for PREfix when simulating the behaviour of code was the explosion in complexity when a given function had multiple paths through it (e.g. multiple ifs/switch statements). PREfix had strategies to limit the time spent in these highly complex functions. Here's a 2004 link that discusses the limitations of PREfix's simulated analysis - https://www.microsoft.com/en-us/research/wp-content/uploads/... The above article also talks about Microsoft's newer (for 2004) static analysis tools. There's a Netscape engineer endorsement in a CNet article from when they first released PREfix. see https://www.cnet.com/tech/tech-industry/component-bugs-stamp... |
|
|
|
|
|
|
| ▲ | mrshadowgoose 11 hours ago | parent | prev | next [-] |
| > Not "hidden", but probably more like "no one bothered to look". Well yeah. There weren't enough "someones" available to look. There are a finite number of qualified individuals with time available to look for bugs in OSS, resulting in a finite amount of bug finding capacity available in the world. Or at least there was. That's what's changing as these models become competent enough to spot and validate bugs. That finite global capacity to find bugs is now increasing, and actual bugs are starting to be dredged up. This year will be very very interesting if models continue to increase in capability. |
| |
| ▲ | literalAardvark 7 hours ago | parent [-] | | I was just thinking about this and what it means for closed source code. Many people with skin in the game will be spending tokens on hardening OSS bits they use, maybe even part of their build pipelines, but if the code is closed you have to pay for that review yourself, making you rather uncompetitive. You could say there's no change there, but the number of people who can run a Claude review and the number of people who can actually review a complicated codebase are several orders of magnitude apart. Will some of them produce bad PRs? Probably. The battle will be to figure out how to filter them at scale. | | |
| ▲ | dolmen 5 hours ago | parent [-] | | I have no doubt that LLMs can be as good at analyzing binaries as at analyzing source code. An avalanche of 0-days in proprietary code is coming. |
|
|
|
| ▲ | NitpickLawyer 18 hours ago | parent | prev [-] |
| > This is something a lot of static analysers can easily find. And yet they didn't (either no one ran them, or they didn't find it, or they found it but it was buried in hundreds of false positives) for 20+ years... I find it funny that every time someone does something cool with LLMs, there's a bunch of takes like this: it was trivial, it's just not important, my dad could have done that in his sleep. |
| |
| ▲ | userbinator 18 hours ago | parent | next [-] | | Remember Heartbleed in OpenSSL? That long predated LLMs, but same story: some bozo forgot how long something should/could be, and no one else bothered to check either. | | |
| ▲ | sam_bristow 5 hours ago | parent | next [-] | | I believe that once the OpenBSD team started cleaning up some of the other gross coding-style stuff as part of their fork into LibreSSL, even fairly simplistic static analysis tools could spot the underlying bug that caused Heartbleed. | | |
| ▲ | tptacek 3 hours ago | parent [-] | | The bug that caused Heartbleed was extremely obvious: read a u16 out of a packet, copy that many bytes of the source packet into the reply packet. If someone put that code in front of you in isolation, you would spot it instantly (if you know C). The problem (and this is hugely the case with most memory safety bugs) is that it's buried under a mountain of OpenSSL TLS protocol handling details. You have to keep resident in your brain what all the inputs to the function are, and follow them through the code. |
| |
| ▲ | dlopes7 13 hours ago | parent | prev [-] | | Hey we are the bozos | | |
| |
| ▲ | choeger 12 hours ago | parent | prev | next [-] | | It's much, much easier to run an LLM than to use a static or dynamic analyzer correctly. At the very least, the UI has improved massively with "AI". | | |
| ▲ | pixl97 5 hours ago | parent [-] | | Most people have no idea how hard it is to run static analysis on C/C++ code bases of any size. There are a lot of ways to do it wrong that eat a ton of memory/CPU time or start pruning things that are needed. If you know what you're doing, you can split the code up into smaller chunks where you can look with more depth in a timely fashion. |
| |
| ▲ | mrshadowgoose 11 hours ago | parent | prev | next [-] | | And even if that's true (and it frequently is!), detractors usually miss the underlying and immense impact of "sleeping dad capability" equivalent artificial systems. Horizontally scaling "sleeping dads" takes decades, but inference capacity for a sleeping dad equivalent model can be scaled instantly, assuming one has the hardware capacity for it. The world isn't really ready for a contraction of skill dissemination going from decades to minutes. | |
| ▲ | pjmlp 14 hours ago | parent | prev | next [-] | | Most likely no one ran them, given the developer culture. | |
| ▲ | wat10000 5 hours ago | parent | prev [-] | | There’s the classic case of the Debian OpenSSL vulnerability, where technically illegal but practically secure code was turned into superficially correct but fundamentally insecure code in an attempt to fix a bug identified by a (dynamic, in this case) analyzer. |
|