CGMthrowaway 3 hours ago

> The gambling analogy completely falls apart on inspection. Slot machines have variable reward schedules by design — every element is optimized to maximize time on device. Social media optimizes for engagement, and compulsive behavior is the predictable output. The optimization target produces the addiction.

Intermittent variable rewards, whether produced by design or merely as a byproduct, will induce compulsive behavior, no matter the optimization target. This applies to Claude.

ctoth 3 hours ago | parent | next [-]

Sometimes I will go out and I will plant a pepper plant and take care of it all summer long and obsessively ensure it has precisely the right amount of water and compost and so on... and ... for some reason (maybe I was on vacation and it got over 105 degrees?) I don't get a good crop.

Does this mean I should not garden because it's a variable reward? Of course not.

Sometimes I will go out fishing and I won't catch a damn thing. Should I stop fishing?

Obviously no.

So what's the difference? What is the precise mechanism here that you're pointing at? Because "sometimes life is disappointing" is a reason to do nothing. And yet.

roblh 3 hours ago | parent | next [-]

It's not a binary thing, it's a spectrum. There are many elements of uncertainty in every action imaginable. I'm inclined to agree with the other commenter though: the LLM slot machine is absolutely closer on that spectrum to gambling than your example is.

Anthropic's optimization target is getting you to spend tokens, not produce the right answer. It's to produce an answer plausible enough but incomplete enough that you'll continue to spend as many tokens as possible for as long as possible. That's about as close to a slot machine as I can imagine. Slot rewards are designed to keep you interested as long as possible, on the premise that you _might_ get what you want, the jackpot, if you play long enough.

Anthropic's game isn't limited to a single spin either. The small wins (small prompts with well defined answers) are support for the big losses (trying to one shot a whole production grade program).

Aurornis an hour ago | parent | next [-]

> Anthropic's optimization target is getting you to spend tokens, not produce the right answer.

The majority of us are using their subscription plans with flat rate fees.

Their incentive is the precise opposite of what you say. The less we use the product, the more they benefit. It's like a gym membership.

I think all of the gambling addiction analogies in this thread are so strained that I can't take them seriously. The basic facts aren't even consistent with the real situation.

8note 2 hours ago | parent | prev | next [-]

im on a subscription though.

they want me to not spend tokens. that way my subscription makes money for them rather than costing them electricity and degrading their GPUs

sweetjuly an hour ago | parent [-]

Wouldn't that apply only to a truly unlimited subscription? Last I looked all of their subs have a usage limit.

If you're on anything but their highest tier, it's not altogether unreasonable for them to optimize for the greatest number of plan upgrades (people who decide they need more tokens) while minimizing cancellations (people frustrated by the number of tokens they need). On the highest tier, this sort of falls apart but it's a problem easily solved by just adding more tiers :)

Of course, I don't think this is actually what's going on, but it's not irrational.

pixl97 3 hours ago | parent | prev [-]

> you'll continue to spend as many tokens as possible for as long as possible.

I mean this only works if Anthropic is the only game in town. In your analogy if anyone else builds a casino with a higher payout then they lose the game. With the rate of LLM improvement over the years, this doesn't seem like a stable means of business.

outofpaper 3 hours ago | parent | prev [-]

??? I'm pretty sure you know what the differences are. Go touch grass and tell me it's the same as looking at a plant on a screen.

Dealing with organic and natural systems will, most of the time, have a variable reward. The real issue comes from systems and services designed to only be accessible through intermittent variable rewards.

Oh, and don't confuse Claude's artifacts working most of the time with them actually optimizing to be that way. They're optimizing to ensure token usage, i.e. LLMs have been fine-tuned to default to verbose responses. Those responses are impressive to less experienced developers, often make certain types of errors easier to detect (e.g. improper typing), and will make you use more tokens.

squeaky-clean 2 hours ago | parent [-]

So gambling is fine as long as I'm doing it outside. Poker in a casino? Bad. Poker in a foresty meadow, good. Got it.

mikkupikku 2 hours ago | parent [-]

Basically true tbqh. Poker is maybe the one exception, but you're almost always better off gambling "in the wild" e.g. poker night with your buds instead of playing slots or anything else where "the house" is always winning in the long run. Are your losses still circulating in your local community, or have they been siphoned off by shareholders on the other side of the world? Gambling with friends is just swapping money back and forth, but going to a casino might as well be lighting the money on fire.

Aurornis an hour ago | parent | prev | next [-]

> Intermittent variable rewards, whether produced by design or merely as a byproduct, will induce compulsive behavior, no matter the optimization target.

This is an incorrect understanding of intermittent variable reward research.

Claims that it "will induce compulsive behavior" are not consistent with the research. Most rewards in life are variable and intermittent and people aren't out there developing compulsive behavior for everything that fits that description.

There are many counter-examples, such as job searching: It's clearly an intermittent variable reward to apply for a job and get a good offer for it, but it doesn't turn people into compulsive job-applying robots.

The strongest addictions to drugs also have little to do with being intermittent or variable. Someone can take a precisely measured abuse-threshold dose of a drug on a strict schedule and still develop compulsions to take more, at a level that eclipses any behavior they'd encounter naturally.

Intermittent variable reward schedules can be a factor in increasing anticipatory behavior and rewards, but claiming that they "will induce compulsive behavior" is a severe misunderstanding of the science.

bonoboTP 2 hours ago | parent | prev | next [-]

And that's only bad if it's illusory or fake. This reaction evolved because it's adaptive. With slot machines the brain is tricked into believing there is some strategy or method to crack, and the reward signals make the addict feel that some kind of progress is being made in return for some kind of effort.

The variability in e.g. soccer kicks or basketball throws is also there, but clearly there is a skill element and a potential for progress. Same with many other activities. Coding with LLMs is not so different. There are clearly ways you can do it better and it's not pure randomness.

pixl97 3 hours ago | parent | prev [-]

>Intermittent variable rewards,

So you're saying businesses shouldn't hire people either?