| ▲ | dataviz1000 a day ago | |
Red-team adversaries are very effective. If Claude is blind to a bug, the bug won't surface when the same model plays the red-team adversary. It takes a different model, and gpt-5.5 is great for that. Yesterday I tried, for the first time, using gpt-5.5 as an adversary against the tests themselves. Later I thought it would be interesting to create a trickster agent that copies the entire project into /tmp/, so it controls every aspect of it, and then breaks the code. Claude insists this is called mutation testing. The agent would create regressions and then run all the tests. Eventually it was able to create an effective test harness unsupervised. | ||
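For anyone who wants to try the copy, break, and re-test loop without an agent, here is a minimal Python sketch. It is not the setup described above: a fixed list of textual substitutions stands in for the trickster agent, and the names `MUTATIONS`, `run_tests`, and `mutation_survivors` are hypothetical. It assumes the test command exits non-zero on failure.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical mutations: (original, replacement) text pairs.
# A real trickster agent would invent far subtler regressions.
MUTATIONS = [("+", "-"), ("==", "!="), ("<", "<=")]


def run_tests(project: Path, test_cmd: list[str]) -> bool:
    """Return True if the test command exits with status 0."""
    result = subprocess.run(test_cmd, cwd=project, capture_output=True)
    return result.returncode == 0


def mutation_survivors(
    project: Path, target: str, test_cmd: list[str]
) -> list[tuple[str, str]]:
    """Return the mutations of `target` that the tests failed to catch."""
    survivors = []
    for old, new in MUTATIONS:
        # Work on a throwaway copy so the real project is never touched.
        with tempfile.TemporaryDirectory() as tmp:
            sandbox = Path(tmp) / "project"
            shutil.copytree(
                project, sandbox, ignore=shutil.ignore_patterns("__pycache__")
            )
            source_file = sandbox / target
            source = source_file.read_text()
            if old not in source:
                continue  # this mutation does not apply to the file
            # Inject one regression: replace the first occurrence only.
            source_file.write_text(source.replace(old, new, 1))
            # If the tests still pass, the regression went unnoticed.
            if run_tests(sandbox, test_cmd):
                survivors.append((old, new))
    return survivors
```

Called as `mutation_survivors(Path("my_project"), "calc.py", [sys.executable, "-m", "unittest"])` (paths hypothetical), a non-empty result lists regressions the test suite missed, which is the signal for where the tests need strengthening.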