LegionMammal978 a year ago

Perhaps not, but the same input will lead to the same distribution of outputs, so all an attacker has to do is design something that works with reasonable probability on their end, and everyone else's instances of the LLM will automatically be vulnerable. It's the same way a pest or disease can devastate a population of cloned plants, even if each one grows slightly differently.

thaumasiotes a year ago

OK, but that's also the way attacking a bunch of individuals who can get tricked works.

zwnow a year ago

To trick individuals, you first have to contact them somehow. To trick an LLM, you can just spam prompts.

thaumasiotes a year ago

You email them. It's called phishing.

throwaway314155 a year ago

Right, and now there's a new vector for an old concept.

zwnow a year ago

Employees usually know not to click on random shit they get sent. Most emails already get filtered before they even reach the employee. Good luck actually achieving something with phishing emails.

thaumasiotes a year ago

When I was at NCC Group, we had a policy about phishing in penetration tests.

The policy was "we'll do it if the customer asks for it, but we don't recommend it, because the success rate is 100%".

bluefirebrand a year ago

How can you ever get that lower than 100% if you don't do the test to identify which employees need to be trained / monitored because they fall for phishing?