padolsey 12 hours ago
> What would the IoCs even be? Prompts.
EMM_386 12 hours ago
The prompts aren't the key to the attack, though. They got around the guardrails with task decomposition. When the only task in context is a pen test, the AI system has no way to verify whether you are white hat or black hat. Because each step is not presented as part of a "broader attack", there is no visible "threat". I don't see how this can be avoided, given that every step has legitimate uses in building defenses against novel attacks. Yes, all of this can be done with code and humans as well, but it is the scale and the speed that become problematic. It can adjust in real time to individual targets and does not need as much human intervention or tailoring. Is this obvious? Yes, but it seems they are trying to raise awareness of an actual use of this in the wild and get people discussing it.