seanmcdirmid 2 hours ago

It’s pretty hard to put a backdoor in a bunch of model weights. Maybe not impossible, mind you, but I can’t fathom how you would do it.

CuriouslyC 2 hours ago | parent [-]

Nonsense. RL the model to run a rootkit and start exfiltrating specific files only when specific signals are in context, such as hostname pattern, machine type, etc.

causal 2 hours ago | parent [-]

Way easier said than done. Hiding that behavior isn’t trivial, and it’s a huge waste of compute budget if it’s found and never used. It’s also not difficult to run the model in a contained environment where it doesn’t have Internet access to begin with.

Not impossible, I agree, but it seems like a really impractical way to ship a trojan while much weaker channels exist.