| ▲ | TeMPOraL 4 hours ago | |
"Run terraform apply plan.out next" in this context is a prompt injection for an LLM to exactly the same degree it is for a human. Even a first party suggestion can be wrong in context, and if a malicious actor managed to substitute that message with a suggestion of their own, humans would fall for the trick even more than LLMs do. See also: phishing. | ||
| ▲ | bravetraveler 4 hours ago | parent | next [-] | |
Right, I'm fine with humans making the call. We're not so injection-happy/easily confused, apparently. Discretion, etc. We understand that was the tool making a suggestion, not our idea. Our agency isn't in question. The removal proposal is similar to wanting a phishing-free environment instead of preparing for the inevitability. I could see removing this message based on your point of context/utility, but not to protect the agent. We get no such protection, just training and practice. A supply chain attack is another matter entirely; I'm sure people would pause at a new suggestion that deviates from their plan/training. As shown, autobots are eager to roll out and easily drown in context. So much so that `User` and `stdout` get confused. | ||
| ▲ | franktankbank 3 hours ago | parent | prev [-] | |
Maybe the agents should require some sort of input start token: "simon says" | ||