nazgul17 5 days ago

Thinking aloud, but couldn't someone create a website with malicious text that, when quoted in a prompt, convinces the LLM to expose certain private data to the web page? The web page could then send that data to a third party itself, without the LLM needing to do so.

This is probably possible to mitigate, but I fear what people more creative, motivated and technically adept could come up with.

FeepingCreature 5 days ago

At least with finetuning, yes: https://arxiv.org/abs/2512.09742

It's unclear if this technique could also work with in-prompt data.

yunohn 5 days ago

Why does the LLM get to send data to the website? That’s my whole point: if you don’t expose a way for it to send data anywhere, it can’t.
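
To make the "don't expose a way to send data" idea concrete, here is a minimal sketch of a deny-by-default tool dispatcher. Everything in it is hypothetical (the `TOOLS` registry, `dispatch`, and the `word_count` tool are illustrative names, not any real framework's API); the only point is that a tool the model asks for but that was never registered simply cannot run.

```python
from typing import Callable, Dict


def word_count(text: str) -> int:
    """A purely local tool: it reads its input and performs no I/O."""
    return len(text.split())


# Hypothetical registry: the only capabilities the model can invoke.
# Nothing here can open a socket or make an HTTP request.
TOOLS: Dict[str, Callable[[str], int]] = {
    "word_count": word_count,
}


def dispatch(tool_name: str, argument: str) -> int:
    """Run a registered tool; refuse anything that is not registered."""
    handler = TOOLS.get(tool_name)
    if handler is None:
        # Deny by default: an injected prompt can request "http_post",
        # but there is no such capability to hand over.
        raise PermissionError(f"tool {tool_name!r} is not exposed")
    return handler(argument)
```

This only covers channels that go through the dispatcher. Anything that renders or executes the model's output somewhere else would need its own restriction.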