jeffrallen 5 days ago
My experience with a really clever agentic workflow (I use sketch.dev) is that the LLM is playing both blue and red team. If I give a good spec, it will make the thing I'm asking for, and then it will test it better than I would have done myself (partly because it's more clever than me, but mostly because it's way harder working than I am, or rather it puts more effort into testing than I would be able to with the time left over after writing the thing). Also, I can ask it to do security reviews on the system it's made, and it works with its same characteristic fervor. I love Tao's observation, but I disagree, at least for the domains I'm allowing LLMs to create for, that they should not play both teams.