▲ simonw 2 days ago
Yikes! Given the inherent threat of prompt injection, using the weakest available version of Gemini seems like a particularly bad idea. Not that even the strongest models are 100% effective at spotting prompt injection attacks, but they have way more of a fighting chance than Gemini Nano does.
|
▲ tadfisher 2 days ago | parent
You could contort the threat model such that prompt injection is something to worry about with a local model operating on local data and serving local results, sure.

▲ BryantD 2 days ago | parent
I think the "local results" assumption is not completely accurate. This line: "You tell Gemini in Chrome what you want to get done, and it acts on web pages on your behalf, while you focus on other things" implies that the local agent will perform in-browser actions, which in theory enables data exfiltration.

▲ tadfisher 2 days ago | parent
This iteration of Gemini doesn't perform in-browser actions, but they did announce they'll ship an agent later.

▲ BryantD 2 days ago | parent
Yes. I agree that many of the announced and currently shipping features should be just fine from a security perspective with only a local agent.

▲ simonw 2 days ago | parent
Running an LLM locally makes no difference at all to the threat of malicious instructions making it into the model's context and causing unwanted actions or exfiltrating data. If anything, a local LLM is more likely to have those problems because it's not as capable at detecting malicious tricks as a larger model.

▲ tadfisher 2 days ago | parent
This is the MCP problem, essentially, and the solution is the same: the user should review and approve specific actions before they are taken. Of course, there will probably be a setting to auto-approve everything...
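
A minimal sketch of the review-and-approve gate described above, in TypeScript. The PageAction shape and the approve/execute callbacks are assumptions for illustration, not Chrome's actual interfaces:

```typescript
// Hypothetical shape of an action the agent proposes to take on a page.
interface PageAction {
  kind: "click" | "fill" | "navigate";
  target: string;   // CSS selector or URL (illustrative)
  value?: string;   // text to enter, if any
}

// Run model-proposed actions only after explicit per-action approval.
// `approve` is assumed to be wired to some browser UI prompt;
// `execute` is whatever actually performs the action in the page.
async function runWithApproval(
  actions: PageAction[],
  approve: (description: string) => Promise<boolean>,
  execute: (action: PageAction) => Promise<void>,
): Promise<void> {
  for (const action of actions) {
    const description =
      `${action.kind} ${action.target}` +
      (action.value ? ` with "${action.value}"` : "");
    // The user sees each concrete action, not just the overall plan.
    if (!(await approve(description))) {
      return; // stop the run as soon as anything is rejected
    }
    await execute(action);
  }
}
```

An "auto-approve everything" setting amounts to replacing approve with an always-true callback, which is exactly the point where this mitigation stops helping.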
|
|
|
▲ gruez 2 days ago | parent
What if there's a prompt injection in some random web page that you visited?

▲ senordevnyc 2 days ago | parent
Hopefully they have it just returning a simple boolean result for whether the page is suspicious, and no tool calling.

▲ janalsncm 2 days ago | parent
If I had a tool which could determine whether a page was suspicious, why would I need an LLM to call it?
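
For what it's worth, the boolean-only shape senordevnyc describes might look roughly like this sketch, where the prompt parameter stands in for whatever API exposes the on-device model (an assumption, not Chrome's actual interface). The model isn't calling a suspiciousness tool; it is the classifier, and its output is reduced to a single bit:

```typescript
// Classify a page as suspicious or not using a local model.
// `prompt` is a stand-in for whatever API exposes the on-device model;
// the only thing the caller ever gets back is a boolean.
async function isPageSuspicious(
  pageText: string,
  prompt: (input: string) => Promise<string>,
): Promise<boolean> {
  const reply = await prompt(
    "Answer with exactly one word, YES or NO. " +
      "Is the following page likely a scam or phishing attempt?\n\n" +
      pageText,
  );
  // Anything other than a clear YES counts as NO; the model's text is
  // never interpreted as an instruction, so there is no action to hijack.
  return /^\s*yes\b/i.test(reply.trim());
}
```

Within a design like this, prompt injection in the page can at worst flip the classification; there is no tool call or action for injected text to redirect.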
|
|
▲ janalsncm 2 days ago | parent
No system is 100% foolproof. If the baseline is “all malicious content gets through” and this method blocks 95% of it, with the remaining 5% requiring some sophisticated prompt injection, that’s not a “yikes”, that’s a major win. At a technical level the risk isn’t from the size of the model but from the fact that it is open weight, and anyone can use it to create an adversarial payload.

▲ simonw 2 days ago | parent
I disagree. In software security, 95% is not a win - it's an invitation for users to trust a system that they shouldn't be trusting.
|