mickayz 3 days ago
Hey folks, one of the authors of the original post here. First, I want to thank simonw for coming up with the lethal trifecta (our direct inspiration for this work), and to thank Simon and others for all of the great feedback we’ve received! Our goal in publishing this framework was to inspire precisely these kinds of discussions so our industry can move its understanding of these risks forward.

Regarding the concerns over the Venn diagram labeling certain intersections as “safe”: this is 100% valid, and we’ve updated the diagram to be clearer. The goal of the Rule of Two is not to describe a sufficient level of security for agents, but rather a minimum bar that’s needed to deterministically prevent the highest-impact security consequences of prompt injection. The earlier framing of “safe” did not make this clear. Beyond prompt injection, there are other risks that have to be considered, which we briefly describe in the Limitations section of the post.

That said, we do see value in having the Rule of Two to frame discussions about what unambiguous constraints exist today because of the unsolved risk of prompt injection.

Looking forward to further discussion!