ozgrakkurt 5 days ago

What do you think about leaning on fuzz testing and deriving unit tests from bugs found by fuzzing?

JonChesterfield 5 days ago | parent | next [-]

You end up with a pile of unit tests with names like "regression: don't crash when rhs is null" or "regression: terminate on this input", which seems fine.
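
For illustration, a minimal sketch of what such a fuzzer-derived regression test can look like in C++. parse_u16 is a hypothetical function under test, and the inputs stand in for minimized fuzzer crash files:

    #include <cassert>
    #include <cstdint>
    #include <optional>
    #include <string_view>

    // Toy stand-in for the real function the fuzzer crashed.
    std::optional<uint16_t> parse_u16(std::string_view s) {
        if (s.empty()) return std::nullopt;  // the fix the crash forced
        uint32_t v = 0;
        for (char c : s) {
            if (c < '0' || c > '9') return std::nullopt;
            v = v * 10 + static_cast<uint32_t>(c - '0');
            if (v > 0xFFFF) return std::nullopt;  // reject overflow
        }
        return static_cast<uint16_t>(v);
    }

    int main() {
        // "regression: don't crash on empty input" -- fuzzer-found
        assert(!parse_u16("").has_value());
        // "regression: terminate on overflowing input" -- fuzzer-found
        assert(!parse_u16("99999999999999999999").has_value());
        return 0;
    }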

The "did it change?" genre of characterisation/snapshot tests can be created very effectively using a fuzzer, but should probably be kept separate from the unit tests checking for specific behaviour, and partially regenerated when deliberately changing behaviour.

LLVM has a bunch of tests generated mechanically from whatever the implementation does and checked in. I do not rate these - they're thousands of lines long, glow red in code review and I'm pretty sure don't get read by anyone in practice - but because they exist, more focused tests do not get written.

manmal 5 days ago | parent | prev [-]

What kind of bugs do you find this way, besides missing sanitization?

cookiengineer 5 days ago | parent | next [-]

Pointer errors. Null-pointer returns instead of using the correct types. Flow/state problems. Multithreading problems. I/O errors. Network errors. Parsing bugs, etc.

Basically the whole world of bugs introduced by someone being a too-clever C/C++ coder. You can battle-test parsers quite nicely with fuzzers, because parsers often have multiple states that assume naive input data structures.
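
As an illustration, a minimal libFuzzer-style harness for such a parser; parse_message is hypothetical, with exactly the kind of trusted length field that fuzzers punish. Build with clang++ -g -fsanitize=fuzzer,address harness.cc:

    #include <cstddef>
    #include <cstdint>

    // Toy parser with a naive-input assumption: it trusts the length
    // byte at the front of the message.
    bool parse_message(const uint8_t *data, size_t size) {
        if (size == 0) return false;
        size_t len = data[0];
        if (len + 1 > size) return false;  // the bounds check a fuzzer forces
        // ... interpret data[1 .. len] ...
        return true;
    }

    // libFuzzer entry point: sanitizers turn memory errors into crashes,
    // and any crash fails the run and saves the offending input.
    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        parse_message(data, size);
        return 0;
    }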

ozgrakkurt 5 days ago | parent | prev | next [-]

You can use the fuzzer to generate test cases instead of writing test cases manually.

For example you can make it generate queries and data for a database and generate a list of operations and timings for the operations.

Then you can mix assertions into the test so you make sure everything is going as expected.

This is very useful because there can be many combinations of inputs, timings, and so on, and it tests basically everything for you without you needing to write a million unit tests.
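
A minimal sketch of that idea in libFuzzer terms (all names hypothetical): the fuzzer's bytes are decoded into a sequence of operations run against both a KvStore under test and a trivial std::map reference model, with the assertions mixed in after each read:

    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <map>

    // Stand-in for the real, complicated store under test.
    struct KvStore {
        std::map<uint8_t, uint8_t> m;
        void put(uint8_t k, uint8_t v) { m[k] = v; }
        void del(uint8_t k) { m.erase(k); }
        bool get(uint8_t k, uint8_t &v) const {
            auto it = m.find(k);
            if (it == m.end()) return false;
            v = it->second;
            return true;
        }
    };

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        KvStore store;
        std::map<uint8_t, uint8_t> model;  // trivially-correct reference
        // Decode each 3-byte chunk as one operation: opcode, key, value.
        for (size_t i = 0; i + 3 <= size; i += 3) {
            uint8_t op = data[i] % 3, k = data[i + 1], v = data[i + 2];
            if (op == 0) { store.put(k, v); model[k] = v; }
            else if (op == 1) { store.del(k); model.erase(k); }
            else {
                uint8_t got = 0;
                bool found = store.get(k, got);
                auto it = model.find(k);
                assert(found == (it != model.end()));  // presence agrees
                assert(!found || got == it->second);   // value agrees
            }
        }
        return 0;
    }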

manmal 5 days ago | parent | next [-]

That sounds worse than letting an LLM dream up tests, tbh. I wouldn't want to groom a huge number of randomly generated tests for their usefulness after the fact. And just keeping all of them will lock the implementation in place where it currently is, not validate its correctness.

raddan 5 days ago | parent | prev [-]

You can often find memory errors not directly related to string handling with fuzz testing. More generally, if your program embodies any kind of state machine, you may find that a good fuzzer drives it into states that you did not think should exist.
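
As a sketch of that, a libFuzzer harness that feeds each fuzz byte into a toy connection state machine as an event and asserts an invariant after every step (all names hypothetical):

    #include <cassert>
    #include <cstddef>
    #include <cstdint>

    enum class State { Idle, Open, Closed };

    struct Connection {
        State s = State::Idle;
        void event(uint8_t e) {
            switch (e % 3) {
            case 0: if (s == State::Idle) s = State::Open; break;    // open
            case 1: if (s == State::Open) s = State::Closed; break;  // close
            case 2: if (s == State::Closed) s = State::Idle; break;  // reset
            }
        }
    };

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        Connection c;
        bool opened = false;
        for (size_t i = 0; i < size; ++i) {
            c.event(data[i]);
            if (c.s == State::Open) opened = true;
            // Invariant: a connection that was never opened cannot be
            // closed. A violated assert here is the fuzzer finding a
            // state you did not think should exist.
            assert(c.s != State::Closed || opened);
        }
        return 0;
    }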

manmal 5 days ago | parent [-]

That sounds a bit like using a jackhammer to drive in a nail. Wouldn’t it be smarter to enumerate edge cases and test all permutations of those?

quacksilver 5 days ago | parent [-]

Would it even be possible to enumerate all edge cases and test all the permutations of them in non-trivial codebases or interconnected systems? How do you know when you have all of the edge cases?

With fuzzing you can randomly generate bad input that passes all of the test cases you wrote by whatever method you were already using, but still causes the application to crash or behave badly. That may mean there are more tests you could write to catch the issue behind the fuzz case, or the fuzz case itself can be used as a test.

Using probability, you can get to 90%, 99%, 99.999%, or whatever confidence level you need that the software is unaffected by a given bug, based on the input size and the number of fuzz test cases. In many non-critical situations the goal is not 100% but 'statistically very unlikely, with a known probability and error'.
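
As a rough sketch of that arithmetic, assuming each random input independently triggers a given bug with probability p:

    P(miss after N inputs) = (1 - p)^N ≈ e^(-p·N)
    e.g. p = 10^-6, N = 4.6 × 10^6  →  P(miss) ≈ e^(-4.6) ≈ 0.01  (99% confidence)

The numbers here are illustrative, not from the comment above; the point is that confidence grows predictably with the number of fuzz runs.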

manmal 5 days ago | parent | prev [-]

Thanks for elaborating, I might start fuzzing.