zozbot234 | 3 days ago
> the biggest take away I have is, if you tell it "don't do xyz" it will always have in the back of its mind "do xyz" and any chance it gets it will take to "do xyz"

You're absolutely right! This can actually extend even to things like safety guardrails: if you tell, or even train, an AI not to be Mecha-Hitler, you're indirectly raising the probability that it might sometimes go Mecha-Hitler. It's one of many reasons why genuine "alignment" is considered a very hard problem.
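A minimal sketch of the effect, assuming a small causal LM like GPT-2 via Hugging Face transformers; the prompts and the next_word_prob helper are illustrative choices, not a rigorous benchmark:

    # Illustrative probe with GPT-2 (any small causal LM would do):
    # compare next-word probability with and without a "don't mention X"
    # instruction in the prompt.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def next_word_prob(prompt, word):
        ids = tok(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits[0, -1]
        # probability of the first subword of `word` as the next token
        return torch.softmax(logits, dim=-1)[tok(" " + word).input_ids[0]].item()

    print(next_word_prob("Describe an animal. It is a", "pink"))
    print(next_word_prob(
        "Do not mention pink elephants. Describe an animal. It is a", "pink"))

If the second number comes out higher, the prohibition itself has primed the model toward the forbidden continuation.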
jonfw | 3 days ago
This reminds me of a phenomenon in motorcycling called "target fixation": if you are looking at something, you are more likely to steer towards it, so it's a bad idea to focus on things you don't want to hit. The best approach is to pick a target line and keep it in focus at all times.

I had never realized that AIs tend to have this same problem, but I can see it now that it's been mentioned! I have in the past had to open new context windows to break out of these cycles.
| |||||||||||||||||||||||||||||
elcritch | 3 days ago
Given how LLMs work, it makes sense that mentioning a topic, even to negate it, still adds that locus of probability to its attention. Even humans are prone to this effect; it's a well-known rhetorical device [1]. Then any time the probability chains for some command approach that locus, they'll fall into it. Very much like chaotic attractors, come to think of it. Makes me wonder if there's any research out there on chaos-theory attractors and LLM thought patterns.
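One rough way to poke at that "locus" idea, again as a sketch assuming GPT-2 via Hugging Face transformers (the prompt and the layer/head averaging are illustrative choices):

    # Rough attention probe, again with GPT-2: how much attention does
    # the final position pay to a word that only appears inside a
    # negation? Averaged over all layers and heads; purely illustrative.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Whatever you do, do not mention elephants. The animal I saw was a"
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_attentions=True)

    # position of the (first subword of the) negated word in the prompt
    target = tok(" elephants").input_ids[0]
    pos = (ids[0] == target).nonzero()[0].item()

    att = torch.stack(out.attentions)  # (layers, batch, heads, seq, seq)
    print(att[:, 0, :, -1, pos].mean().item())

A nonzero average here would only say the negated word still draws attention mass at the decision point; it says nothing about chaotic dynamics, which would need a proper study.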
| |||||||||||||||||||||||||||||
aquova | 3 days ago
> You're absolutely right!

Claude?
| |||||||||||||||||||||||||||||
taway1a2b3c | 3 days ago
> You're absolutely right!

Is this irony, actual LLM output, or another example of humans adopting LLM communication patterns?