| ▲ | nazgul17 5 days ago | |
Thinking aloud, but couldn't someone create a website with some malicious text that, when quoted in a prompt, convinces the LLM to expose certain private data to the web page, and couldn't the webpage send that data to a third party, without the need for the LLM to do so? This is probably possible to mitigate, but I fear what people more creative, motivated and technically adept could come up with. | ||
| ▲ | FeepingCreature 5 days ago | parent | next [-] | |
At least with finetuning, yes: https://arxiv.org/abs/2512.09742 It's unclear if this technique could also work with in-prompt data. | ||
| ▲ | yunohn 5 days ago | parent | prev [-] | |
Why does the LLM get to send data to the website?? That’s my whole point, if you don’t expose a way for it to send data anywhere, it can’t. | ||