anuramat 2 hours ago

> tweak badness enough

assuming you get to do gradient descent AND the context is fixed+known AND you have unlimited compute? sure; is it a realistic setup?

> the only way to fix ...

the exact same argument applies to any (sufficiently complex) piece of software, with exactly the same conclusion

also technically I'd argue that we do know the input/output space (set of all token strings of length <= N/token), and know the mapping (the model is a ~pure function in terms of the api, which is about as good of a representation as it gets for a non-invertible mapping); at least it's much closer than with something like linux

solid_fuel an hour ago

> assuming you get to do gradient descent AND the context is fixed+known AND you have unlimited compute? sure; is it a realistic setup?

Clearly nothing so complicated is required, given the prompt in the very article you are commenting on.

> the exact same argument applies to any (sufficiently complex) piece of software, with exactly the same conclusion

Yeah and the halting problem is hard too, but there's levels to this shit.

> also technically I'd argue that we do know the input/output space (set of all token strings of length <= N/token), and know the mapping (the model is a ~pure function in terms of the api, which is about as good of a representation as it gets for a non-invertible mapping); at least it's much closer than with something like linux

I would argue we don't even know the desired output for most inputs to an LLM, and they certainly aren't trained on every possible input state. But I think Linux and LLMs are sufficiently different that they aren't really directly comparable like this. After all, Linux is not a pure function and has lots of side effects.

But just to establish an order of magnitude: the context window for GPT-3 was 2,048 tokens long, and there were 50,257 tokens in the vocabulary. Counting only full-length inputs, the input space thus has 50,257^2048 unique states, which is approximately 1.12 × 10^9628. That's an awfully big input space for a single function.
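For anyone who wants to check that figure, here is a minimal sketch of the arithmetic, assuming the 2,048-token context and 50,257-token vocabulary quoted above. The function names are made up for illustration, and it works in log space rather than printing the full integer:

```python
import math

VOCAB_SIZE = 50_257   # vocabulary size, as quoted in the comment above
CONTEXT_LEN = 2_048   # context length in tokens, as quoted in the comment above


def log10_input_space(vocab_size: int, context_len: int) -> float:
    """log10 of vocab_size ** context_len, i.e. the count of full-length inputs."""
    return context_len * math.log10(vocab_size)


def as_scientific(log10_value: float) -> tuple[float, int]:
    """Split a log10 value into (mantissa, exponent) for scientific notation."""
    exponent = math.floor(log10_value)
    mantissa = 10 ** (log10_value - exponent)
    return mantissa, exponent


mantissa, exponent = as_scientific(log10_input_space(VOCAB_SIZE, CONTEXT_LEN))
print(f"{mantissa:.2f} x 10^{exponent}")
```

This only counts inputs that fill the whole context. Including every shorter input as well multiplies the total by roughly 1 + 1/50,257, so it doesn't change the order of magnitude.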

anuramat an hour ago

> clearly nothing ... is required

this isn't even prompt injection; even if it was, how do you go from "exists" to "for all"?

> we don't know the desired output

then what are we talking about? if you don't know how you want your software to behave, how do you define a bug?

> linux is not a pure function ...

which is my point -- it's worse

> to establish an order of magnitude

and for linux?

solid_fuel 17 minutes ago

> this isn't even prompt injection; even if it was, how do you go from "exists" to "for all"?

Yes it is, and nice backtrack in the same sentence there. I've laid out plenty of evidence here so far, it's your turn to start thinking. We'll try the Socratic method.

Given that every LLM seen so far has been vulnerable to prompt injection attacks, what possible basis do you have for thinking that one can be made immune to them? I'm going from "multiple attacks of this type exist for all known models, and the attacks exploit a known weakness in the design" to "therefore all LLMs are susceptible to this attack".

You're going from "an attack exists for all known models" to "it's definitely possible to build an LLM that is immune to this attack". That's a much larger leap, so show the logic backing your assertion.

> then what are we talking about? if you don't know how you want your software to behave, how do you define a bug?

You are the one asserting that input/output mappings exist for the entire space, not me.

>> linux is not a pure function ...

> which is my point -- it's worse

What, is this your first year in CS? No useful system can be a pure function. Side effects are work; if your function doesn't have a side effect, it does no work. Any system that uses an LLM to attempt work will have side effects - they may even include bombing an elementary school in Iran.

>> to establish an order of magnitude

> and for linux?

I've done all the thinking and all the research in this conversation so far, and I even specifically explained that you can't measure state space for a stateful function in a comparable way to a pure function. Clearly you didn't understand that, so if you want to force the comparison you can start adding up the state space for the Linux kernel. Start with the spaces that are covered by tests; valid items include syscalls, registers, hardware interrupts, etc.

Invalid spaces include doing something intentionally stupid like using the entire size of the RAM or the space on the hard disk, since those are accessed on demand and not - like in an LLM - all added together and fed into a blender every time a syscall is made.