| ▲ | CuriouslyC 4 hours ago | |||||||||||||||||||||||||
Mitigate prompt injection to the best of your ability, implement a policy layer over all capabilities, and isolate capabilities within the system so if one part gets compromised you can quarantine the result safely. It's not much different than securing human systems really. If you want more details there are a lot of AI security articles, I like https://sibylline.dev/articles/2026-02-15-agentic-security/ as a simple primer. | ||||||||||||||||||||||||||
| ▲ | SpicyLemonZest 3 hours ago | parent [-] | |||||||||||||||||||||||||
Nobody can mitigate prompt injection to any meaningful degree. Model releases from large AI companies are routinely jailbroken within a day. And for persistent agents the problem is even worse, because you have to protect against knowledge injection attacks, where the agent "learns" in step 2 that an RPC it'll construct in step 9 should be duplicated to example.com for proper execution. I enjoy this article, but I don't agree with its fundamental premise that sanitization and model alignment help. | ||||||||||||||||||||||||||
| ||||||||||||||||||||||||||