Remix clone Hacker News

	▲	yen223 2 hours ago
		There's a lot of overlap between the "disregard this" vulnerability among LLMs and social engineering vulnerabilities among humans. The mitigations are also largely the same, i.e. limit the blast radius of what a single compromised agent (LLM or human) can do.
	▲	calpaterson 2 hours ago
		I agree, and one of the things that makes "disregard that!" harder to handle is that many models for LLM deployment involve positioning the agent centrally and giving it admin superpowers. I mention in the footnotes that I think it makes more sense for the end-user of the LLM to be the one running it. That meshes with RBAC better (the user's LLM session only has the perms the user is actually entitled to) and doesn't devolve into praying the LLM stays on-task.
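		A rough sketch of what "the session only has the user's perms" could look like, assuming the LLM acts through tool calls. The names here (`Session`, `TOOL_PERMS`, `run_tool`, the permission strings) are made up for illustration and not taken from any real framework:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    user: str
    perms: frozenset  # permissions the user is actually entitled to


# Hypothetical tools and the permission each one requires.
TOOL_PERMS = {
    "read_ticket": "tickets:read",
    "delete_user": "users:admin",
}


def run_tool(session, tool, arg):
    """Run a tool call requested by the LLM, checked against the user's perms."""
    required = TOOL_PERMS[tool]
    if required not in session.perms:
        # The check lives outside the model, so a hijacked prompt cannot skip it.
        raise PermissionError(f"{session.user} lacks {required}")
    return f"{tool}({arg}) ok"
```

		With this shape, a successful "disregard that!" can at worst do what the user could already do by hand.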
	▲	zahlman an hour ago
		It also seems to have a fair bit in common with SQL injection.
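		For anyone who hasn't seen the SQL side of the comparison, here is a minimal sketch using Python's standard `sqlite3` module. The table and the hostile input are invented for the example. In the first query, untrusted text is spliced into the same string as the instructions; in the second it is passed separately as a parameter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a1"), ("bob", "b2")],
)

hostile = "nobody' OR '1'='1"

# Input spliced into the query text: the data rewrites the instruction
# and the WHERE clause becomes true for every row.
injected = conn.execute(
    f"SELECT secret FROM users WHERE name = '{hostile}'"
).fetchall()

# Parameterized query: the input stays data and matches no user.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (hostile,)
).fetchall()
```

		The common thread is instructions and untrusted data travelling in one string.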