Remix.run Logo
yunohn 5 days ago

I believe you are conflating multiple concepts to prove a flaky point.

Again, unless your agent has access to a function that exfiltrates data, it is impossible for it to do so. Literally!

You do not need to provide any tools to an LLM that summarizes or translates websites, manages your open tabs, etc. This can be done fully locally in a sandbox.

Linking to simonw does not make your argument valid. He makes some great points, but he does not assert what you are claiming at any point.

Please stop with this unnecessary fear mongering and make a better argument.

nazgul17 5 days ago | parent [-]

Thinking aloud, but couldn't someone create a website with some malicious text that, when quoted in a prompt, convinces the LLM to expose certain private data to the web page, and couldn't the webpage send that data to a third party, without the need for the LLM to do so?

This is probably possible to mitigate, but I fear what people more creative, motivated and technically adept could come up with.

FeepingCreature 5 days ago | parent | next [-]

At least with finetuning, yes: https://arxiv.org/abs/2512.09742

It's unclear if this technique could also work with in-prompt data.

yunohn 5 days ago | parent | prev [-]

Why does the LLM get to send data to the website?? That’s my whole point, if you don’t expose a way for it to send data anywhere, it can’t.