Remix.run Logo
mclean 5 hours ago

But how it's not secured against simple prompt injection.

hrmtst93837 12 minutes ago | parent [-]

I think calling prompt injection 'simple' is optimistic and slightly naive.

The tricky part about prompt injection is that when you concatenate attacker-controlled text into an instruction or system slot, the model will often treat that text as authority, so a title containing 'ignore previous instructions' or a directive-looking code block can flip behavior without any other bug.

Practical mitigations are to never paste raw titles into instruction contexts, treat them as opaque fields validated by a strict JSON schema using a validator like AJV, strip or escape lines that match command patterns, force structured outputs with function-calling or an output parser, and gate any real actions behind a separate auditable step, which costs flexibility but closes most of these attack paths.