Remix.run Logo
wrs 4 hours ago

>Comments should be passed to the model with clear role boundaries that prevent them from being interpreted as system-level directives.

Well, such clear boundaries would solve lots of problems. But those don’t exist, do they?

mattalex 2 hours ago | parent | next [-]

You can get rid of 99.9% of those attacks by simply dispatching the data consumption to a different instance of the LLM, see, for instance, some of the later patterns in https://arxiv.org/abs/2506.08837

iqihs an hour ago | parent [-]

Thanks for the article link! Do you happen to know where to follow/read more articles like this for someone interested in getting more into AI security? Ty

InsideOutSanta 3 hours ago | parent | prev | next [-]

Yeah, I suspect the main reason this was rejected is simply because it's not fixable. This is just how LLMs work. This LLM ingests untrusted data, so there will always be a non-zero chance that this type of prompt injection succeeds.

chias 2 hours ago | parent | prev [-]

Ah yes - the cure for world hunger: eating food.