pc86 5 days ago

I hesitate to use phrases like "bait and switch", but it seems like every model gets released and is borderline awe-inspiring; then, as adoption and load increase, it's like it gets hit in the head with a hammer and is basically useless for anything beyond a multi-step Google search.

dingnuts 5 days ago | parent | next [-]

I think it's a psychological bias of some sort. When the feeling of newness wears off and you realize the model is still kind of shit, you have an imperfect memory of the first few uses when you were excited, and you've repressed the failures from that period. As the hype wears off, you become more critical and correctly evaluate the model.

Uehreka 5 days ago | parent | next [-]

I get that it’s fun and stylish to tell people they aren’t aware of their own cognitive biases, but it’s also a difficult take to falsify, which is why I generally have a high bar for people to clear when they want to assert that something is all in people’s heads.

People seem to turn to this a lot when a suspicion many people share is difficult to verify. And while I don’t trust a suspicion just because it’s held by a lot of people, I also won’t allow myself to embrace the comforting certainty of “it’s surely false and it’s psychological bias”.

Sometimes we just need to not be sure what’s going on.

ewoodrich 5 days ago | parent [-]

Doesn't this go both ways? A random selection of online commenters, out of the hundreds of thousands of devs using LLMs, reporting degraded capability based on personal perception isn't exactly statistically meaningful data.

I've seen the cycle of claims going from "10x multiplier, like a team of junior devs" to "nerfed" for so many model/tool releases at this point that it's hard for me not to believe there's an element of perceptual bias going on. But how much that contributes vs. real variability on the backend is impossible to know for sure.

lacy_tinpot 5 days ago | parent | prev [-]

It's not bias, because the degradation is actually tracked and has even been acknowledged by the companies themselves.

citizenAlex 5 days ago | parent | prev | next [-]

I think the models deteriorate over time with more inputs; the noise increases like photocopies of photocopies.

mh- 5 days ago | parent [-]

If you mean within an individual context window, yes, that's a known phenomenon.

If you mean over the lifetime of a model being deployed, no, that's not how these models are trained.
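A minimal sketch of the point, assuming a PyTorch-style serving path and using a toy nn.Linear as a stand-in for a deployed LLM: inference is forward passes only, with no optimizer step, so the weights are bit-identical no matter how many requests you push through.

    import torch
    import torch.nn as nn

    # Toy stand-in for a deployed model: serving only runs forward passes.
    model = nn.Linear(16, 16)
    model.eval()

    before = model.weight.detach().clone()

    with torch.no_grad():            # inference path: no gradients, no weight updates
        for _ in range(1000):        # simulate a thousand "user requests"
            _ = model(torch.randn(1, 16))

    # The weights never change; user traffic doesn't train the model.
    assert torch.equal(model.weight, before)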

otabdeveloper4 5 days ago | parent | prev | next [-]

No, that's just the normal slope of the hype curve as you start figuring out how the man behind the curtain operates.

rootnod3 5 days ago | parent | prev | next [-]

AI is not useful in the long term and is unsustainable. News at 11.

j45 5 days ago | parent | prev [-]

It’s important to jump on new models super early, before the guardrails get put in.

Anyone remember GPT-4 the day it launched? :)