This is the conclusion I've come to as well, working on a relatively new codebase. Our rule is that every generated test must be human-reviewed; otherwise it's an autodelete.