| ▲ | lelanthran 2 hours ago | |
This conclusion:

> I am less worried about prompt injection now. Before running this experiment, I expected prompt injection to be much easier than it turned out to be.

is unwarranted. Sure, the agent never output the secret, but did it output anything else? IOW, was it usable? An agent that considers every prompt an attack (and responds accordingly) "passes" this test, while being useless anyway. | ||
| ▲ | doix an hour ago | |
Yeah, I remember some ad by an LLM security company hitting HN a year or so ago with a "challenge" to do prompt injection. The final level was their product, and it was impossible. But it was also impossible to get the LLM to do _anything_. You may as well just echo "prompt injection attempt detected" at that point and never send anything to an LLM. | ||
| ▲ | CookieCrisp 11 minutes ago | |
Plus, if you're a black hat using prompt injection for a living, you're probably unwilling to share your methods in a test like this. The participants were likely mostly people who are not experts in prompt injection. | ||