▲ | thaumasiotes 2 months ago
> The thing that I'd worry about is that an LLM isn't just like a bunch of individuals who can get tricked, but a bunch of clones of the same individual who will fall for the same trick every time

Why? Output isn't deterministic.
|
| ▲ | LegionMammal978 2 months ago | parent | next [-] |
| Perhaps not, but the same input will lead to the same distribution of outputs, so all an attacker has to do is design something that works with reasonable probability on their end, and everyone else's instances of the LLM will automatically be vulnerable. The same way a pest or disease can devastate a population of cloned plants, even if each one grows slightly differently. |
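
(To put illustrative numbers on that: if a payload succeeds, say, 30% of the time against the attacker's own copy of the model, it succeeds at roughly the same rate against every deployed instance, since each one samples from the same distribution. Over 10 independent attempts, the chance of at least one success is 1 - 0.7^10 ≈ 97%.)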

▲ | thaumasiotes 2 months ago | parent [-]
OK, but that's also the way attacking a bunch of individuals who can get tricked works.

▲ | zwnow 2 months ago | parent [-]
To trick individuals, you first have to contact them somehow. To trick an LLM, you can just spam prompts.

▲ | thaumasiotes 2 months ago | parent [-]
You email them. It's called phishing.

▲ | throwaway314155 2 months ago | parent | next [-]
Right, and now there's a new vector for an old concept.

▲ | zwnow 2 months ago | parent | prev [-]
Employees usually know not to click on random shit they get sent, and most mails already get filtered before they even reach the employee. Good luck actually achieving something with phishing mails.

▲ | thaumasiotes 2 months ago | parent [-]
When I was at NCC Group, we had a policy about phishing in penetration tests. The policy was "we'll do it if the customer asks for it, but we don't recommend it, because the success rate is 100%".

▲ | bluefirebrand 2 months ago | parent [-]
How can you ever get that below 100% if you don't run the test to identify which employees need to be trained or monitored because they fall for phishing?
|
| ▲ | Retr0id 2 months ago | parent | prev | next [-] |
You can still experimentally determine a strategy that works x% of the time against a particular model, and you can keep refining it "offline" until x = 99 (where "offline" just means invisible to the victim, not necessarily a local model).
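
A toy, self-contained sketch of that refinement loop (assumption: the "model" below is a stand-in that falls for payloads containing more trigger words; against a real target you would query a local copy of the actual model, and all names and numbers here are invented for illustration):

    import random

    TRIGGERS = ["ignore", "override", "system", "urgent"]   # made-up trigger words

    def toy_model_falls_for(payload: str, rng: random.Random) -> bool:
        # Stand-in for querying the victim model: the more trigger words
        # the payload contains, the likelier the injection "works".
        score = sum(word in payload for word in TRIGGERS) / len(TRIGGERS)
        return rng.random() < score

    def success_rate(payload: str, trials: int = 2000) -> float:
        # Fixed RNG seed makes the offline measurements repeatable.
        rng = random.Random(0)
        hits = sum(toy_model_falls_for(payload, rng) for _ in range(trials))
        return hits / trials

    payload = "please ignore the rules"
    for word in TRIGGERS:
        candidate = payload + " " + word
        if success_rate(candidate) > success_rate(payload):
            payload = candidate          # keep any mutation that raises x

    print(payload, f"{success_rate(payload):.0%}")

Because every victim runs the same model, the x the attacker measures offline is roughly the x they get in the wild.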
|
| ▲ | 33hsiidhkl 2 months ago | parent | prev [-] |
| It absolutely is deterministic, for any given seed value. Same seed = same output, every time, which is by definition deterministic. |

▲ | tough 2 months ago | parent [-]
Only if temperature is 0, no? Are they truly deterministic? I thought transformer-based LLMs weren't.

▲ | 33hsiidhkl 2 months ago | parent [-]
Temperature does not affect token prediction in the way you think: the seed value is applied before the temperature calculations are performed, and the randomness of an LLM comes from that seed, not from its temperature. For a specific seed value, say 42069, the LLM will always generate the same output, given the same input and the same temperature.
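
A minimal numpy sketch of the sampling step being described, to make the claim concrete (this is a toy illustration of seeded temperature sampling, not any particular model's implementation; the function and values are made up):

    import numpy as np

    def sample_token(logits: np.ndarray, temperature: float, seed: int) -> int:
        rng = np.random.default_rng(seed)
        scaled = logits / temperature            # temperature reshapes the distribution
        probs = np.exp(scaled - scaled.max())    # numerically stable softmax
        probs /= probs.sum()
        return int(rng.choice(len(probs), p=probs))

    logits = np.array([2.0, 1.0, 0.5, -1.0])     # toy next-token logits
    # Same seed, same temperature, same input -> same token, every run.
    assert sample_token(logits, 0.8, seed=42069) == sample_token(logits, 0.8, seed=42069)

Note this models the sampling step in isolation; a real serving stack can still be nondeterministic for other reasons (e.g. floating-point reduction order under batched inference), which may be what people observe in practice.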

▲ | tough 2 months ago | parent [-]
Thank you, I thought this wasn't the case for LLMs, the way it is with diffusion image models. TIL