Remix clone Hacker News

new | show | ask | jobs Github

	▲	mclean 5 hours ago
		But how it's not secured against simple prompt injection.
	▲	hrmtst93837 12 minutes ago \| parent [-]
		I think calling prompt injection 'simple' is optimistic and slightly naive. The tricky part about prompt injection is that when you concatenate attacker-controlled text into an instruction or system slot, the model will often treat that text as authority, so a title containing 'ignore previous instructions' or a directive-looking code block can flip behavior without any other bug. Practical mitigations are to never paste raw titles into instruction contexts, treat them as opaque fields validated by a strict JSON schema using a validator like AJV, strip or escape lines that match command patterns, force structured outputs with function-calling or an output parser, and gate any real actions behind a separate auditable step, which costs flexibility but closes most of these attack paths.