This is nice work, but I found the bug finding example to be weird:

> One such bug was in the sign function for zigzag decoding of the datrs/varinteger library. On input Std.U64.MAX, the expression (value + 1) overflowed, causing crashes in debug mode and silent corruption in release mode—an edge case that testing and fuzzing would typically miss.

In what way would this boundary condition case be considered something that "testing [...] would typically miss"? It's certainly something that bad tests would miss or not think about, but I find that (a) careful people and (b) ML coding systems are actually really good at "oh, I should test the extreme values". Especially for things that parse user input.

I'm curious if they found other bugs that were more interesting, but found them too hard to explain quickly.

▲

fjdjshsh an hour ago | parent | next [-]

Maybe it's not something they would "typically miss", but, from proof by existence, it's something they sometimes miss.

It does speak to the benefits of using lean in that you don't need to be clever about the different examples you test.

▲

Exoristos an hour ago | parent | prev | next [-]

Yes, it's basic QA. If tests missed this kind of thing, they would be of much more limited use than we generally expect them to be. It raises questions about the authors' background.

▲

pierrefermat1 2 hours ago | parent | prev [-]

When you dogfood your AI slop all day, suddenly everything becomes impressive

	▲	fjdjshsh an hour ago \| parent [-]
		I feel this is a lazy, straw-manning comment.