Remix.run Logo
filearts 5 hours ago

It is fascinating how similar the prompt construction was to a phishing campaign in terms of characteristics.

  - Authority assertion
  - False urgency
  - Technical legitimacy
  - Security theater
Prompt injection here is like a phishing campaign against an entity with no consciousness or ability to stop and question through self-reflection.
XenophileJKO 3 hours ago | parent [-]

I'm fairly convinced that with the right training.. the ability of the LLM to be "skeptical" and resilient to these kinds of attacks will be pretty robust.

The current problem is that making the models resistant to "persona" injection is in opposition to much of how the models are also used conversationally. I think this is why you'll end up with hardened "agent" models and then more open conversational models.

I suppose it is also possible that the models can have an additional non-prompt context applied that sets expectations, but that requires new architecture for those inputs.

BarryMilo 3 hours ago | parent [-]

Isn't the whole problem that it's nigh-impossible to isolate context from input?