LegionMammal978 2 months ago

Perhaps not, but the same input will lead to the same distribution of outputs, so all an attacker has to do is design something that works with reasonable probability on their end, and everyone else's instances of the LLM will automatically be vulnerable. It's the same way a pest or disease can devastate a population of cloned plants, even if each one grows slightly differently.

thaumasiotes 2 months ago

OK, but that's also the way attacking a bunch of individuals who can get tricked works.

zwnow 2 months ago

To trick individuals, you first have to contact them somehow. To trick an LLM, you can just spam prompts.

thaumasiotes 2 months ago

You email them. It's called phishing.

throwaway314155 2 months ago

Right, and now there's a new vector for an old concept.

zwnow 2 months ago

Employees usually know not to click on random shit they get sent. Most emails already get filtered before they even reach the employee. Good luck actually achieving something with phishing emails.

thaumasiotes 2 months ago

When I was at NCC Group, we had a policy about phishing in penetration tests.

The policy was "we'll do it if the customer asks for it, but we don't recommend it, because the success rate is 100%".

bluefirebrand 2 months ago

How can you ever get that lower than 100% if you don't do the test to identify which employees need to be trained / monitored because they fall for phishing?