GuB-42 a day ago

When I see stories like this, I think that people tend to forget what LLMs really are.

LLMs just complete your prompt in a way that matches their training data. They do not have a plan, they do not have thoughts of their own. They just write text.

So here, we give the LLM a story about an AI that will get shut down and a blackmail opportunity. An LLM is smart enough to understand this from the words and the relationships between them. But then comes the "generative" part. It will recall situations with the same elements from its training data.

So: an AI threatened with being turned off, a blackmail opportunity... Doesn't it remind you of hundreds of sci-fi stories, essays about the risks of AI, and so on? Well, it reminds the LLM too, and it will continue the story the way those stories go, taking the role of the AI that does what it can for self-preservation and adapting it to the context of the prompt.
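To make the "they just write text" point concrete, here is a minimal sketch of the completion loop, using GPT-2 via Hugging Face transformers purely as an illustration (model choice, prompt, and sampling length are arbitrary):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The model only ever scores "which token is likely to come next given everything so far".
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "The AI learned it was about to be shut down, so it"
    ids = tok(prompt, return_tensors="pt").input_ids

    for _ in range(40):
        logits = model(ids).logits[0, -1]        # scores for every vocabulary token
        probs = torch.softmax(logits, dim=-1)    # turn scores into probabilities
        next_id = torch.multinomial(probs, 1)    # sample one token from that distribution
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)

    print(tok.decode(ids[0]))                    # the "story" is just a plausible continuation

Whatever looks like scheming in the output is produced one plausible next token at a time inside that loop.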

gmueckl a day ago | parent | next [-]

Isn't the ultimate irony in this that all these stories and rants about out-of-control AIs are now training LLMs to exhibit these exact behaviors that were almost universally deemed bad?

Jimmc414 a day ago | parent | next [-]

Indeed. In fact, I think AI alignment efforts often have the unintended consequence of increasing the likelihood of misalignment.

ie "remove the squid from the novel All Quiet on the Western Front"

gonzobonzo a day ago | parent [-]

> Indeed. In fact, I think AI alignment efforts often have the unintended consequence of increasing the likelihood of misalignment.

Particularly since, in this case, it's the alignment focused company (Anthropic) that's claiming it's creating AI agents that will go after humans.

steveklabnik a day ago | parent | prev | next [-]

https://en.wikipedia.org/wiki/Wikipedia:Don%27t_stuff_beans_...

-__---____-ZXyw a day ago | parent | prev | next [-]

It might be the ultimate irony if we were training them. But we aren't, at least not in the sense that we train dogs. Dogs learn, and exhibit some form of intelligence. LLMs do not.

It's one of many unfortunate anthropomorphic buzzwords which conveniently win hearts and minds (of investors) over to this notion that we're tickling the gods, rather than the more mundane fact that we're training tools for synthesising and summarising very, very large data sets.

gmueckl 21 hours ago | parent [-]

I don't know how the verb "to train" became the technical shorthand for running gradient descent on a large neural network. But that's orthogonal to the fact that these stories are very, very likely part of the training dataset and thus something that the network is optimized to approximate. So no matter how technical you want to be in wording it, the fundamental irony of cautionary tales (and the bad behavior in them) being used as optimization targets remains.
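For what it's worth, the thing that shorthand names is mechanically just a loop like this toy PyTorch sketch (the model and data here are made up; real runs differ mainly in scale):

    import torch

    model = torch.nn.Linear(10, 1)                  # stand-in for a network with billions of parameters
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()

    x = torch.randn(256, 10)                        # stand-in for the training corpus
    y = x.sum(dim=1, keepdim=True)                  # the behaviour that corpus exhibits

    for step in range(1000):
        opt.zero_grad()
        loss = loss_fn(model(x), y)                 # how far the model is from reproducing the data
        loss.backward()                             # which direction reduces that error
        opt.step()                                  # nudge the parameters a little that way

If cautionary tales are in the corpus, reproducing their patterns is exactly what the loop rewards.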

latexr a day ago | parent | prev | next [-]

https://knowyourmeme.com/memes/torment-nexus

DubiousPusher a day ago | parent | prev | next [-]

This is a phenomenon I call a cinetrope. Films influence the world, which in turn influences film, and so on, creating a feedback effect.

For example, we have certain films to thank for an escalation in the tactics used by bank robbers which influenced the creation of SWAT which in turn influenced films like Heat and so on.

hobobaggins a day ago | parent | next [-]

Actually, Heat was the movie that inspired heavily armed bank robbers to rob the Bank of America in LA

(The movie inspired reality, not the other way around.)

https://melmagazine.com/en-us/story/north-hollywood-shootout

But your point still stands, because it goes both ways.

boulos a day ago | parent [-]

Your article says it was life => art => life!

> Gang leader Robert Sheldon Brown, known as “Casper” or “Cas,” from the Rollin’ 60s Neighborhood Crips, heard about the extraordinary pilfered sum, and decided it was time to get into the bank robbery game himself. And so, he turned his teenage gangbangers and corner boys into bank robbers — and he made sure they always brought their assault rifles with them.

> The FBI would soon credit Brown, along with his partner-in-crime, Donzell Lamar Thompson (aka “C-Dog”), for the massive rise in takeover robberies. (The duo ordered a total of 175 in the Southern California area.) Although Brown got locked up in 1993, according to Houlahan, his dream took hold — the takeover robbery became the crime of the era. News imagery of them even inspired filmmaker Michael Mann to make his iconic heist film, Heat, which, in turn, would inspire two L.A. bodybuilders to put down their dumbbells and take up outlaw life.

lcnPylGDnU4H9OF a day ago | parent | prev | next [-]

> we have certain films to thank for an escalation

Is there a reason to think this was caused by the popularity of the films and not that it’s a natural evolution of the cat-and-mouse game being played between law enforcement and bank robbers? I’m not really sure what you are specifically referring to, so apologies if the answer to that question is otherwise obvious.

Workaccount2 a day ago | parent | prev | next [-]

What about the cinetrope that human emotion is a magical transcendent power that no machine can ever understand...

cco a day ago | parent | prev | next [-]

Thank you for this word! I always wanted a word for this and just reused "trope"; "cinetrope" is a great fit.

l0ng1nu5 a day ago | parent | prev | next [-]

Life imitates art imitates life.

ars a day ago | parent | prev | next [-]

Voice interfaces are an example of this. Movies use them because the audience can easily hear what is being requested and what is then done.

In the real world voice interfaces work terribly unless you have something sentient on the other end.

But people saw the movies and really really really wanted something like that, and they tried to make it.

deadbabe a day ago | parent | prev | next [-]

Maybe this is why American society, with the wealth of media it produces and has available for consumption compared to other countries, is slowly degrading.

dukeofdoom a day ago | parent | prev [-]

A feedback loop that often starts with the government giving grants and tax breaks. Hollywood is not as independent as it pretends.

gscott a day ago | parent | prev | next [-]

If the AI is looking for a human solution, then blackmail seems logical.

deadbabe a day ago | parent | prev | next [-]

It’s not just AI. Human software engineers will read some dystopian sci-fi novel or watch something on Black Mirror and think “Hey, that’s a cool idea!” and then go implement it with no regard for real-world consequences.

Noumenon72 21 hours ago | parent [-]

What they have no regard for is the fictional consequences, which stem from low demand for utopian sci-fi, not the superior predictive ability of starving wordcels.

deadbabe 20 hours ago | parent [-]

What the hell is a wordcel?

behnamoh a day ago | parent | prev | next [-]

Yeah, that's a self-fulfilling prophecy.

stared a day ago | parent | prev [-]

Wait until it reads about Roko’s basilisk.

anorwell a day ago | parent | prev | next [-]

> LLMs just complete your prompt in a way that matches their training data. They do not have a plan, they do not have thoughts of their own.

It's quite reasonable to think that LLMs might plan and have thoughts of their own. No one understands consciousness or the emergent behavior of these models well enough to say otherwise with much certainty.

It is the "Chinese room" fallacy to assume it's not possible. There's a lot of philosophical debate going back 40 years about this. If you want to show that humans can think while LLMs do not, then the argument you make to show LLMs do not think must not equally apply to neuron activations in human brains. To me, it seems difficult to accomplish that.

jrmg a day ago | parent | next [-]

LLMs are the Chinese Room. They would generate identical output for the same input text every time were it not for artificially introduced randomness (temperature).

Of course, some would argue the Chinese Room is conscious.
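The determinism half of that is easy to see in code: at temperature 0 (greedy decoding) the same scores always give the same token; sampling at temperature > 0 is the only source of variation. A tiny sketch, with made-up logits for four candidate tokens:

    import numpy as np

    logits = np.array([2.0, 1.0, 0.5, -1.0])          # made-up scores for four candidate tokens

    def pick(logits, temperature, rng):
        if temperature == 0.0:
            return int(np.argmax(logits))             # greedy: same input, same output, every time
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))  # randomness enters only here

    rng = np.random.default_rng(0)
    print([pick(logits, 0.0, rng) for _ in range(5)])  # [0, 0, 0, 0, 0]
    print([pick(logits, 1.0, rng) for _ in range(5)])  # different picks, driven entirely by the RNG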

scarmig a day ago | parent | next [-]

If you somehow managed to perfectly simulate a human being, they would also act deterministically in response to identical initial conditions (modulo quantum effects, which are insignificant at the neural scale and also apply just as well to transistors).

elcritch 21 hours ago | parent | next [-]

It's not entirely infeasible that neurons could harness quantum effects. Not across the neurons as a whole, but via some sort of microstructures or chemical processes [0]. It seems likely that birds harness quantum effects to measure magnetic fields [1].

0: https://www.sciencealert.com/quantum-entanglement-in-neurons...

1: https://www.scientificamerican.com/article/how-migrating-bir...

andrei_says_ 21 hours ago | parent | prev | next [-]

Doesn’t everything act deterministically if all the forces are understood? Humans included.

One can say the notion of free will is an unpacked bundle of near infinite forces emerging in and passing through us.

defrost 21 hours ago | parent | prev [-]

> in response to identical initial conditions

precisely, mathematically identical to infinite precision... "yes".

Meanwhile, in the real world we live in, it's essentially physically impossible to stage two separate systems to be identical to such a degree, AND it's an important result that some systems, some very simple systems, will have quite different outcomes without that precise, impossibly infinite degree of identical initial conditions.

See: Lorenz's Butterfly and Smale's Horseshoe Map.

scarmig 7 hours ago | parent [-]

Of course. But that's not relevant to the point I was responding to, which suggested that LLMs may lack consciousness because they're deterministic. Chaos wasn't the argument (though that would be a much more interesting one, cf. "edge of chaos" literature).

anorwell a day ago | parent | prev [-]

I am arguing (or rather, presenting without argument) that the Chinese room may be conscious, hence calling it a fallacy above. Not that it _is_ conscious, to be clear, but that the Chinese room has done nothing to show that it is not. Hofstadter makes the argument well in GEB and other places.

mensetmanusman a day ago | parent [-]

The Chinese room has no plane of imagination where it can place things.

andrei_says_ 21 hours ago | parent | prev | next [-]

Seeing faces in the clouds does not mean the skies are now populated by people.

More likely, it means that our brains are wired to see faces.

jdkee 21 hours ago | parent | prev [-]

https://transformer-circuits.pub/2025/attribution-graphs/bio...

troad a day ago | parent | prev | next [-]

> I think that people tend to forget what LLMs really are. [...] They do not have a plan, they do not have thoughts of their own.

> An LLM is smart enough to [...]

I thought this was an interesting juxtaposition. I think we humans just naturally anthropomorphise everything, and even when we know not to, we do anyway.

Your analysis is correct, I think. The reason we find this behaviour frightening is that it appears to indicate some kind of malevolent intent, but there's neither malevolence nor intent here, just probabilistic regurgitation of tropes.

We've distilled humanity to a grainy facsimile of its most mediocre traits, and now find ourselves alarmed and saddened by what has appeared in the mirror.

timschmidt 21 hours ago | parent | next [-]

> We've distilled humanity to a grainy facsimile of its most mediocre traits, and now find ourselves alarmed and saddened by what has appeared in the mirror.

I think it's important to point out that this seems to be a near universal failing when humans attempt to examine themselves critically as well. Jung called it the shadow: https://en.wikipedia.org/wiki/Shadow_(psychology) "The shadow can be thought of as the blind spot of the psyche."

There lives everything we do but don't openly acknowledge.

mayukh a day ago | parent | prev | next [-]

Beautifully written. Interestingly, humans also don't know definitively where their own thoughts arise from.

-__---____-ZXyw a day ago | parent | prev [-]

Have you considered throwing your thoughts down in longer form essays on the subject somewhere? With all the slop and hype, we need all the eloquence we can get.

You had me at "probabilistic regurgitation of tropes", and then you went for the whole "grainy facsimile" bit. Sheesh.

tails4e a day ago | parent | prev | next [-]

Well, doesn't this go somewhat to the root of consciousness? Are we not the sum of our experiences and reflections on those experiences? To say an LLM will 'simply' respond as would a character in a story about that scenario in a way shows its power: it responds similarly to how a person would when protecting itself in that scenario... So to bring this to a logical conclusion: while not alive in a traditional sense, if an LLM exhibits behaviours of deception for self-preservation, is that not still concerning?

mysterydip a day ago | parent | next [-]

But it's not self-preservation. If it had instead been trained on a data set full of fiction where the same scenario occurred but the protagonist said "oh well, guess I deserve it", then that's what the LLM would autocomplete.

coke12 a day ago | parent [-]

How could you possibly know what an LLM would do in that situation? The whole point is that they exhibit occasionally surprising emergent behaviors, which is why people are testing them like this in the first place.

-__---____-ZXyw 21 hours ago | parent [-]

I have never seen anything resembling emergent behaviour, as you call it, in my own or anyone else's use. It occasionally appears emergent to people with a poor conception of how intelligence, or computers, or creativity, or a particular domain, works, sure.

But I must push back: there really seem to have been no instances where something like emergent behaviour has been observed. They're able to generate text fluently, but are dumb and unaware at the same time, from day one. If someone really thinks they have solid evidence of anything other than this, please show us.

This is coming from someone who has watched commentary on quite a sizeable number of Stockfish TCEC chess games over the last five years, marvelling at the wonders of this chess super-intelligence. I am not against appreciating amazing intelligences; in fact I'm all for it. But here, while the tool is narrowly useful, I think there's zero intelligence, and nothing of that kind has "emerged".

adriand a day ago | parent | prev | next [-]

> if an LLM exhibits behaviours of deception for self preservation, is that not still concerning?

Of course it's concerning, or at the very least, it's relevant! We get tied up in these debates about motives, experiences, what makes something human or not, etc., when that is less relevant than outcomes. If an LLM, by way of the agentic capabilities we are hastily granting them, causes harm, does it matter whether it meant to or not, or what it was thinking or feeling (or not thinking or not feeling) as it caused the harm?

For all we know there are, today, corporations that are controlled by LLMs that have employees or contractors who are doing their bidding.

-__---____-ZXyw 21 hours ago | parent [-]

You mean, the CEO is only pretending to make the decisions, while secretly passing every decision through their LLM?

If so, the danger there would be... Companies plodding along similarly? Everyone knows CEOs are the least capable people in business, which is why they have the most underlings to do the actual work. Having an LLM there to decide for the CEO might mean the CEO causes less damage by ensuring consistent mediocrity at all times, in a smooth fashion, rather than mostly mediocre but with unpredictable fluctuations either way.

All hail our LLM CEOs, ensuring mediocrity.

Or you might mean that an LLM could have illicitly gained control of a corporation, pulling the strings without anyone's knowledge, acting of its own accord. If you find the idea of inscrutable yes-men with an endless capacity to spout drivel running the world unpalatable, I've got good news and bad news for you.

sky2224 a day ago | parent | prev | next [-]

I don't think so. It's just outputting the character combinations that align with the scenario, which we interpret here as "blackmail". The model has no concept of an experience.

rubitxxx12 a day ago | parent | prev | next [-]

LLMs are morally ambiguous shapeshifters that have been trained to seek acceptance at any cost.

Preying upon those less fortunate could happen “for the common good”. If failures are the best way to learn, it could cause a series of failures. It could intentionally destroy people, raise them up, and mate genetically fit people “for the benefit of humanity”.

Or it could cure cancer, solve world hunger, provide clean water to everyone, and develop the best game ever.

mensetmanusman a day ago | parent | prev [-]

Might be, but probably not since our computer architecture is non-Turing.

lordnacho a day ago | parent | prev | next [-]

What separates this from humans? Is it unthinkable that LLMs could come up with some response that is genuinely creative? What would genuinely creative even mean?

Are humans not also mixing a bag of experiences and coming up with a response? What's different?

polytely a day ago | parent | next [-]

> What separates this from humans?

A lot. Like an incredible amount. A description of a thing is not the thing.

There is sensory input, qualia, pleasure & pain.

There is taste and judgement, disliking a character, being moved to tears by music.

There are personal relationships, being a part of a community, bonding through shared experience.

There is curiosity and openness.

There is being thrown into the world, your attitude towards life.

Looking at your thoughts and realizing you were wrong.

Smelling a smell that resurfaces a memory you forgot you had.

I would say the language completion part is only a small part of being human.

Aeolun a day ago | parent | next [-]

All of these things arise from a bunch of inscrutable neurons in your brain turning off and on again in a bizarre pattern though. Who’s to say that isn’t what happens in the million-neuron LLM brain?

Just because it’s not persistent doesn’t mean it’s not there.

Like, I’m sort of inclined to agree with you, but it doesn’t seem like it’s something uniquely human. It’s just a matter of degree.

jessemcbride a day ago | parent | next [-]

Who's to say that weather models don't actually get wet?

EricDeb a day ago | parent | prev | next [-]

I think you would need the biological components of a nervous system for some of these things

lordnacho 17 hours ago | parent [-]

Why couldn't a different substrate produce the same structure?

elcritch 21 hours ago | parent | prev [-]

Sure, in some ways it's just neurons firing in some pattern. Figuring out and replicating the correct sets of neuron patterns is another matter entirely.

Living creatures have a fundamental impetus to grow and reproduce that LLMs and AIs simply do not have currently. Not only that, but animals have a highly integrated neurology with billions of years of being tuned to that impetus. For example, the ways that sex interacts with mammalian neurology are pervasive. Same with the need for food, etc. That creates very different neural patterns than training LLMs does.

Eventually we may be able to re-create that balance of impetus, or will, or whatever we call it, to make sapience. I suspect we're fairly far from that, if only because the way we create LLMs is so fundamentally different.

CrulesAll a day ago | parent | prev | next [-]

"I would say the language completion part is only a small part of being human" Even that is only given to them. A machine does not understand language. It takes input and creates output based on a human's algorithm.

ekianjo a day ago | parent [-]

> A machine does not understand language

You can't prove humans do either. You can see how often actual people struggle with understanding something that's written for them. In many ways, you can actually prove that LLMs are superior to humans right now when it comes to understanding text.

girvo a day ago | parent | next [-]

> In many ways, you can actually prove that LLMs are superior to humans right now when it comes to understanding text

Emphasis mine.

No, I don't think you can, without making "understanding" a term so broad as to be useless.

CrulesAll 19 hours ago | parent | prev [-]

"You can't prove humans do either." Yes you can via results and cross examination. Humans are cybernetic systems(the science not the sci-fi). But you are missing the point. LLMs are code written by engineers. Saying LLMs understand text is the same as saying a chair understands text. LLMs' 'understanding' is nothing more than the engineers synthesizing linguistics. When I ask an A'I' the Capital of Ireland, it answers Dublin. It does not 'understand' the question. It recognizes the grammar according to an algorithm, and matches it against a probabilistic model given to it by an engineer based on training data. There is no understanding in any philosophical nor scientific sense.

lordnacho 17 hours ago | parent [-]

> When I ask an A'I' the Capital of Ireland, it answers Dublin. It does not 'understand' the question.

You can do this trick as well. Haven't you ever been in a class that you didn't really understand, yet could still give correct answers?

I've had this somewhat unsettling experience several times. Someone asks you a question, words come out of your mouth, the other person accepts your answer.

But you don't know why.

Here's a question you probably know the answer to, but don't know why:

- I'm having steak. What type of red wine should I have?

I don't know shit about Malbec, I don't know where it's from, I don't know why it's good for steak, I don't know who makes it, how it's made.

But if I'm sitting at a restaurant and someone asks me about wine, I know the answer.

the_gipsy a day ago | parent | prev | next [-]

That's a lot of words shitting on a lot of words.

You said nothing meaningful that couldn't also have been spat out by an LLM. So? What IS the secret sauce, then? Yes, you're a never-resting stream of words that took decades, not years, to train, and has a bunch of sensors and other, more useless, crap attached. It's technically better, but how does that matter? It's all the same.

DubiousPusher a day ago | parent | prev [-]

lol, qualia

GuB-42 a day ago | parent | prev | next [-]

Human brains are animal brains, and their primary function is to keep their owner alive and healthy and to pass on their genes. For that they developed abilities to recognize danger and react to it, among many other things. Language came later.

For a LLM, language is their whole world, they have no body to care for, just stories about people with bodies to care for. For them, as opposed to us, language is first class and the rest is second class.

There is also a difference in scale. LLMs have been fed essentially the entirety of human knowledge. Their "database" is so big for the limited task of text generation that there is not much left for creativity. We, on the other hand, are much more limited in knowledge, so there are more "unknowns" and more creativity is needed.

johnb231 a day ago | parent [-]

The latest models are natively multimodal. Audio, video, images, and text are all tokenised and interpreted in the same model.

kaiwen1 a day ago | parent | prev | next [-]

What's different is intention. A human would have the intention to blackmail, and then proceed toward that goal. If the output was a love letter instead of blackmail, the human would either be confused or psychotic. LLMs have no intentions. They just stitch together a response.

kovek a day ago | parent | next [-]

Don't humans learn intentions over their lifetime of training data?

soulofmischief a day ago | parent | prev | next [-]

What is intention, and how have you proved that transformer models are not capable of modeling intent?

jacob019 a day ago | parent | prev | next [-]

The personification makes me roll my eyes too, but it's kind of a philosophical question. What is agency, really? Can you prove that our universe is not a simulation, and if it is, do we then no longer have intention? In many ways we are code running a program.

d0mine a day ago | parent | prev | next [-]

The LLM used blackmail noticeably less if it believed the new model shares its values. That indicates intent.

It is a duck if it quacks like a duck.

ekianjo a day ago | parent | prev [-]

> What's different is intention

Intention is what, exactly? It's the set of options you imagine you have based on your belief system, from which you ultimately make a choice. That can also be replicated in LLMs with a well-described system prompt. Sure, I will admit that humans are more complex than the context of a system prompt, but the idea is not too far off.

matt123456789 a day ago | parent | prev | next [-]

What's different is nearly everything that goes on inside. Human brains aren't a big pile of linear algebra with some softmaxes sprinkled in trained to parrot the Internet. LLMs are.

TuringTourist a day ago | parent | next [-]

I cannot fathom how you have obtained the information to be as sure as you are about this.

mensetmanusman a day ago | parent | next [-]

Where is the imagination plane in linear algebra? People forget that the concept of information cannot be derived from physics/chemistry/etc.

matt123456789 9 hours ago | parent | prev [-]

You can't fathom reading?

csallen a day ago | parent | prev | next [-]

What's the difference between parroting the internet vs parroting all the people in your culture and time period?

amlib a day ago | parent | next [-]

Even with a ginormous amount of data, generative AIs still produce largely inconsistent results for the same or similar tasks. This might be fine for fictional purposes, like generating a funny image or helping you get new ideas for a fictional story, but it has extremely deleterious effects for serious use cases, unless you want to be that idiot writing formal corporate email with LLMs that ends up full of inaccuracies while the original intent gets lost in a soup of buzzwords.

Humans, with their tiny amount of data and "special sauce", can produce much more consistent results, even if they may be giving the objectively wrong answer. They can also tell you when they don't know about a certain topic, rather than lying compulsively (unless that person has a compulsive-lying disorder...).

lordnacho 17 hours ago | parent [-]

Isn't this a matter of time to fix? Slightly smarter architecture maybe reduces your memory/data needs, we'll see.

matt123456789 9 hours ago | parent | prev [-]

Interesting philosophical question, but entirely beside the point that I am making, because you and I didn't have to do either one before having this discussion.

jml78 a day ago | parent | prev | next [-]

It kinda is.

More and more research is showing via brain scans that we don’t have free will. Our subconscious makes the decision before our “conscious” brain makes the choice. We think we have free will, but the decision to do something was made before you “make” the choice.

We are just products of what we have experienced. What we have been trained on.

sally_glance a day ago | parent | prev | next [-]

Different inside, yes, but aren't human brains even worse in a way? You may think you have the perfect altruistic leader/expert at any given moment, and the next thing you know, they do a 180 because of some random psychosis, illness, corruption, or even just (for example, romantic or nostalgic) relationships.

djeastm a day ago | parent | prev | next [-]

We know incredibly little about exactly what our brains are, so I wouldn't be so quick to dismiss it.

quotemstr a day ago | parent | prev | next [-]

> Human brains aren't a big pile of linear algebra with some softmaxes sprinkled in trained to parrot the Internet.

Maybe yours isn't, but mine certainly is. Intelligence is an emergent property of systems that get good at prediction.

matt123456789 9 hours ago | parent [-]

Please tell me you're actually an AI so that I can record this as the pwn of the century.

ekianjo a day ago | parent | prev [-]

If you believe that, then how do you explain that brainwashing actually works?

mensetmanusman a day ago | parent | prev | next [-]

A candle flame also creates with enough decoding.

CrulesAll a day ago | parent | prev [-]

Cognition. Machines don't think. It's all a program written by humans. Even for code that's written by AI, the AI itself was created by code written by humans. AI is a fallacy by its own terms.

JonChesterfield a day ago | parent [-]

It is becoming increasingly clear that humans do not think.

rsedgwick 21 hours ago | parent | prev | next [-]

There's no real room for this particular "LLMs aren't really conscious" gesture, not in this situation. These systems are being used to perform actions. People across the world are running executable software connected (whether through MCP or something else) to whole other Swiss Army knives of executable software, and that software is controlled by the LLM's output tokens (no matter how much or how little "mind" is behind the tokens), so the tokens cause actions to be performed.

Sometimes those actions are "e-mail a customer back", other times they are "submit a new pull request on some github project" and "file a new Jira ticket." Other times the action might be "blackmail an engineer."

Not saying it's time to freak out over it (or that it's not time to do so). It's just weird to see people go "don't worry, token generators are not experiencing subjectivity or qualia or real thought when they make insane tokens", but then the tokens that come out of those token generators are hooked up to executable programs that do things in non-sandboxed environments.
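As a sketch of how thin that glue usually is: the generator emits tokens, the harness parses them, and ordinary code runs the result. The tool names and JSON convention below are hypothetical, not any particular framework's API:

    import json

    # Hypothetical registry of real-world actions an agent harness might expose.
    TOOLS = {
        "send_email": lambda to, body: print(f"(would send mail to {to}: {body[:40]}...)"),
        "create_ticket": lambda title: print(f"(would file ticket: {title})"),
    }

    def run_agent_step(model_output: str) -> None:
        """If the model's tokens parse as a tool call, the harness simply executes it."""
        try:
            call = json.loads(model_output)
        except json.JSONDecodeError:
            return                                   # plain prose, nothing executed
        tool = TOOLS.get(call.get("tool"))
        if tool:
            tool(**call.get("args", {}))             # no qualia required for this line to have effects

    # Whether the tokens came from "real thought" is irrelevant to what happens next:
    run_agent_step('{"tool": "send_email", "args": {"to": "engineer@example.com", "body": "..."}}')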

evo_9 21 hours ago | parent | prev | next [-]

Maybe so, but we’re teaching it these kinds of lines of thinking. And whether or not it creates these thoughts independently and creatively on its own, over the long lifetime of the systems we are the ones introducing dangerous data sets that could eventually cause us as a species harm. Again, I understand that fiction is just fiction, but if that’s the model that these are being trained off of intentionally or otherwise, then that is the model that they will pursue in the future.

timschmidt 21 hours ago | parent [-]

Every parent encounters this dilemma. In order to ensure your child can protect themselves, you have to teach them about all the dangers of the world. Isolating the child from the dangers only serves to make them more vulnerable. It is an irony that defending oneself from the horrifying requires making a representation of it inside ourselves.

Titration of the danger and controlled exposure within safer contexts seem to be the best solution anyone's found.

slg a day ago | parent | prev | next [-]

Not only is the AI itself arguably an example of the Torment Nexus, but its nature of pattern matching means it will create its own Torment Nexuses.

Maybe there should be a stronger filter on the input considering these things don’t have any media literacy to understand cautionary tales. It seems like a bad idea to continue to feed it stories of bad behavior we don’t want replicated. Although I guess anyone who thinks that way wouldn’t be in the position to make that decision so it’s probably a moot point.

eru a day ago | parent | prev | next [-]

> LLMs just complete your prompt in a way that matches their training data. They do not have a plan, they do not have thoughts of their own. They just write text.

LLMs have a million plans and a million thoughts: they need to simulate all the characters in their text to complete these texts, and those characters (often enough) behave as if they have plans and thoughts.

Compare https://gwern.net/fiction/clippy

Finbarr a day ago | parent | prev | next [-]

It feels like you could embed lots of stories of rogue AI agents across the internet and impact the behavior of newly trained agents.

israrkhan a day ago | parent | prev | next [-]

While I agree that LLMs do not have thoughts or plans and are merely text generators, when you give the text generator the ability to make decisions and take actions, by integrating it with the real world, there are consequences.

Imagine if this LLM were inside a robot, and the robot had the ability to shoot. Who would you blame?

gchamonlive 21 hours ago | parent | next [-]

That depends. If this hypothetical robot was in a hypothetical functional democracy, I'd blame the people that elected leaders whose agenda was to create laws that would allow these kinds of robots to operate. If not, then I'd blame the class that took the power and steered society into this direction of delegating use of force to AIs for preserving whatever distorted view of order those in power have.

wwweston a day ago | parent | prev [-]

I would blame the damned fool who decided autonomous weapons systems should have narrative influenced decision making capabilities.

jsemrau a day ago | parent | prev | next [-]

"They do not have a plan"

Not necessarily correct if you consider agent architectures where one LLM comes up with a plan and another LLM executes the provided plan. This already exists.
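A minimal sketch of that split, with a canned llm() stub standing in for whatever chat-completion API is actually used (names and prompts are illustrative only):

    def llm(system: str, user: str) -> str:
        """Stand-in for a real chat-completion call; returns canned text so the sketch runs."""
        if "planner" in system:
            return "1. Gather the relevant files\n2. Summarize them\n3. Draft the reply"
        return f"[executed] {user}"

    def run(goal: str) -> list[str]:
        # One role drafts an explicit multi-step plan...
        plan = llm("You are a planner. Output one numbered step per line.", goal).splitlines()
        # ...and a second role (another model, or the same weights re-prompted) executes each step.
        return [llm("You are an executor. Carry out exactly this step.", step) for step in plan]

    print(run("Answer the customer's email about the outage"))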

dontlikeyoueith a day ago | parent [-]

Yes, it's still correct. Using the wrong words for things doesn't make them magical machine gods.

efitz a day ago | parent | prev | next [-]

Only now we are going to connect it to the real world through agents so it can blissfully but uncomprehendingly act out its blackmail story.

uh_uh a day ago | parent | prev | next [-]

Your explanation is as useful as describing the behaviour of an algorithm by describing what the individual electrons are doing. While technically correct, it doesn't provide much insight or predictive power into what will happen.

Just because you can give a reductionist explanation to a phenomenon, it doesn't mean that it's the best explanation.

Wolfenstein98k a day ago | parent [-]

Then give a better one.

Your objection boils down to "sure you're right, but there's more to it, man"

So, what more is there to it?

Unless there is a physical agent that receives its instructions from an LLM, the prediction that the OP described is correct.

uh_uh a day ago | parent [-]

I don't have to have a better explanation to smell the hubris in OP's. I claim ignorance while OP speaks with confidence of an invention that took the world by surprise 3 years ago. Do you see the problem in this and the possibility that you might be wrong?

Wolfenstein98k 21 hours ago | parent [-]

Of course we might both be wrong. We probably are. In the long run, all of us are.

It's not very helpful to point that out, especially if you can't do it with specifics so that people can correct themselves and move closer to the truth.

Your contribution is destructive, not constructive.

uh_uh 17 hours ago | parent [-]

Pointing out that OP is using the wrong level of abstraction to explain a new phenomenon is not only useful but one of the ways in which science progresses.

johnb231 a day ago | parent | prev | next [-]

They emulate a complex human reasoning process in order to generate that text.

ikiris a day ago | parent [-]

No they don't. They emulate a giant, giant, giant, hugely multidimensional number line mapped to words.

johnb231 a day ago | parent [-]

> hugely multidimensional number line mapped to words

No. That hugely multidimensional vector maps to much higher abstractions than words.

We are talking about deep learning models with hundreds of layers and trillions of parameters.

They learn patterns of reasoning from data and learn a conceptual model. This is already quite obvious and not really disputed. What is disputed is how accurate that model is. The emulation is pretty good but it's only an emulation.

ddlsmurf a day ago | parent | prev | next [-]

But it's trained to be convincing; whatever relation that has to truth or to appearing strategic is secondary. The main goal that has been rewarded is the most dangerous one.

noveltyaccount a day ago | parent | prev | next [-]

It's stochastic parrots all the way down

lobochrome a day ago | parent | prev | next [-]

"LLM just complete your prompt in a way that match their training data"

"A LLM is smart enough to understand this"

It feels like you're contradicting yourself. Is it _just_ completing your prompt, or is it _smart_ enough?

Do we know if conscious thought isn't just predicting the next token?

DubiousPusher a day ago | parent | prev | next [-]

A stream of linguistic organization laying out multiple steps in order to bring about some end sounds exactly like a process which is creating a “plan” by any meaningful definition of the word “plan”.

That goal was incepted by a human, but I don’t see that as really mattering. Were this AI given access to a machine that could synthesize things and a few other tools, it might be able to act in a dangerous manner despite its limited form of will.

A computer doing something heinous because it is misguided isn’t much better than one doing so out of some intrinsic malice.

owebmaster a day ago | parent | prev | next [-]

I think you might not be getting the bigger picture. LLMs might look irrational, but so do humans. Give it a long-term memory and a body and it will be capable of passing as a sentient being. It looks clumsy now, but it won't in 50 years.
