This really only works well if you have a TLA+-style formal model of the algorithm and can generate lots of unit tests from it.
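
To make that concrete, here is a rough sketch of the idea in Python rather than in TLA+ itself. It assumes a toy two-client lock as the "algorithm"; the names `model_step`, `generate_traces`, `Lock`, and `replay` are all made up for illustration. The model is a plain transition function, every enabled action sequence up to a depth bound is enumerated from it, and each sequence is replayed against the implementation as one test case.

```python
# Toy stand-in for a formal model: a lock shared by two clients.
# Everything here is hypothetical and only illustrates the workflow.
FREE = "free"
ACTIONS = [("acquire", "a"), ("acquire", "b"), ("release", "a"), ("release", "b")]


def model_step(state, action):
    """Return the next model state, or None if the action is not enabled."""
    op, client = action
    if op == "acquire" and state == FREE:
        return client
    if op == "release" and state == client:
        return FREE
    return None


def generate_traces(max_depth):
    """Enumerate every enabled action sequence of length 1..max_depth.

    Each trace is (actions, states), where states[i + 1] is the model
    state expected after actions[i].
    """
    traces = []
    frontier = [((), (FREE,))]
    for _ in range(max_depth):
        next_frontier = []
        for actions, states in frontier:
            for action in ACTIONS:
                nxt = model_step(states[-1], action)
                if nxt is not None:
                    item = (actions + (action,), states + (nxt,))
                    traces.append(item)
                    next_frontier.append(item)
        frontier = next_frontier
    return traces


class Lock:
    """Implementation under test (also hypothetical)."""

    def __init__(self):
        self.holder = None

    def acquire(self, client):
        if self.holder is None:
            self.holder = client
            return True
        return False

    def release(self, client):
        if self.holder == client:
            self.holder = None
            return True
        return False

    def observe(self):
        return FREE if self.holder is None else self.holder


def replay(trace):
    """Run one generated trace against the implementation and compare states."""
    actions, states = trace
    lock = Lock()
    for (op, client), expected in zip(actions, states[1:]):
        assert getattr(lock, op)(client), f"{op}({client}) should succeed"
        assert lock.observe() == expected


def run_generated_tests(max_depth=4):
    """Replay every generated trace; return how many were checked."""
    traces = generate_traces(max_depth)
    for trace in traces:
        replay(trace)
    return len(traces)
```

With a real spec you would presumably get the traces from the model checker instead of a hand-written enumerator, but the replay side would look much the same.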