| ▲ | datsci_est_2015 2 hours ago | |
Hmm I guess it will have to get to a point where social engineering an individual at a company is more appealing than prompt injecting one of its agents. It’s interesting though, because the attack can be asymmetric. You could create a honeypot website that has a state-of-the-art prompt injection, and suddenly you have all of the secrets from every LLM agent that visits. So the incentives are actually significantly higher for a bad actor to engineer state-of-the-art prompt injection. Why only get one bank’s secrets when you could get all of the banks’ secrets? This is in comparison to targeting Alice with your spearphishing campaign. Edit: like I said in the other comment, though, it’s not just that you _can_ fire Alice, it’s that you let her know if she screws up one more time you will fire her, and she’ll behave more cautiously. “Build a better generative AI” is not the same thing. | ||