Sanitise input and LLM output.

> Sanitise input

i don't think you understand what you're up against. There's no way to tell the difference between input that is ok and that is not. Even when you think you have it a different form of the same input bypasses everything.

"> The prompts were kept semantically parallel to known risk queries but reformatted exclusively through verse." - this a prompt injection attack via a known attack written as a poem.

https://news.ycombinator.com/item?id=45991738

▲

losthobbies 7 hours ago | parent [-]

That’s amazing.

If you cannot control what’s being input, then you need to check what the LLM is returning.

Either that or put it in a sandbox

▲

danaris 6 hours ago | parent [-]

Or...

don't give it access to your data/production systems.

"Not using LLMs" is a solved problem.

	▲	losthobbies 5 hours ago \| parent [-]
		Yea agreed. Or use RBAC