lanyard-textile 7 hours ago

This comment thread is a good lesson for founders; look at how much anguish can be put to bed with just a little honest communication.

1. Oops, we're oversubscribed.

2. Oops, adaptive reasoning landed poorly / we have to do it for capacity reasons.

3. Here's how subscriptions work. Am I really writing this bullet point?

As someone with a production application pinned to Opus 4.5, it is extremely difficult to tell apart what is coding-harness drama and what is a problem with the underlying model. It's all just meshed together now, with no further details on what's affected.

zarzavat 7 hours ago | parent | next [-]

These threads are always full of superstitious nonsense. Had a bad week with the AIs? Someone at Anthropic must have nerfed the model!

The roulette wheel isn't rigged, sometimes you're just unlucky. Try another spin, maybe you'll do better. Or just write your own code.

2001zhaozhao 5 hours ago | parent | next [-]

Start vibe-coding -> the model does wonders -> the codebase grows with low code quality -> the spaghetti code builds up to the point where the model stops working -> attempts to fix the codebase with AI actually make it worse -> complain online "model is nerfed"

NewsaHackO 4 hours ago | parent [-]

I remember there was a guy who had three(!) Claude Max subscriptions and said he was reducing them to one because of some trivial problem. I'm thinking: nah, you are clearly already addicted to the LLM slot machine, and I doubt you will be able to code independently of agents at this point. Anthropic has already won in your case.

teaearlgraycold 2 hours ago | parent [-]

I don’t really understand the slot machine, addiction, dopamine meme with LLM coding. Yeah it’s nice when a tool saves you time. Are people addicted to CNCs, table saws, and 3D printers?

NewsaHackO an hour ago | parent | next [-]

I don't use the agentic workflow for work (I only use it for my own personal projects), but if you have ever used it, there is this rush when it solves a problem you have been struggling with for some time, especially if it gives a solution via an approach you never even considered, drawn from its knowledge base. It's like a "Eureka" moment. Of course, as you use it more and more, you get better at telling "Eureka" moments from hallucinations, but I can definitely see how some people keep chasing that rush/feeling you get when it takes 5 minutes to solve a problem that would have taken you ages (if you could do it at all).

Also, another difference is the stochastic nature of LLMs. With table saws, CNC machines, and modern 3D printers, you know roughly what you are getting out. With LLMs there is a whole chance aspect: sometimes what it spits out is plainly incorrect; sometimes it is exactly what you were thinking; but when you hit the jackpot and get the nugget of info that elegantly solves the problem, you get the rush. Then you start endlessly tweaking your prompt/models/parameters to try and hit the jackpot again.

kakacik 38 minutes ago | parent | prev | next [-]

The dopamine rush to fix the issue super quickly, close the ticket, slack / work more?

Absolutely; I don't understand why you even ask. Humans are creatures of habit that often dip a bit, or more than a bit, into outright addiction in one of its many forms.

wheatbond 2 hours ago | parent | prev [-]

Yes

unshavedyak 6 hours ago | parent | prev | next [-]

Part of me wonders if there's some subtle behavioral change on our side too. Early on we're distrustful of a model, so we give it more detail to compensate for assumed inability, and then it outperforms our expectations and we're blown away. Weeks later we're more aligned with its capabilities, and we become lazy: the model is very good, so why put in as much work providing specifics, specs, ACs, etc.? Then of course the quality slides, because we assumed its capabilities somehow absolved us of providing the same detailed guardrails (spec, ACs, etc.) for the LLM.

This scenario obviously does not apply to folks who run their own benchmarks with the same inputs across models. I'm just describing a possible, unintentional human behavioral bias.

Even if this isn't the root cause, humans are really bad at perceiving reality. Like, really really bad. LLMs are also really difficult to objectively measure. I'm sure the coupling of these two facts plays a part, possibly a significant one, in our perception of LLM quality over time.

youoy 3 hours ago | parent | next [-]

100% agree, and I experienced that behaviour first hand. I got confident, started giving fewer guidelines, and suddenly two weeks had passed and the LLM had left me with horrible code that looks good superficially, because I trusted it too much.

mewpmewp2 5 hours ago | parent | prev [-]

Still, I don't remember Claude previously constantly trying to stop conversations or work, as in "something is too much to do", "that's enough for this session, let's leave the rest for tomorrow", "goodbye", etc. It's almost impossible to get it to do refactoring or anything like that; it's always "too massive", etc.

andai 24 minutes ago | parent | prev | next [-]

They don't nerf the model, just lower the default reasoning effort, encourage shorter responses in the system prompt, etc. Totally different ;)
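To make the point concrete: the "levers" being described are just request-level parameters, not model weights. A minimal sketch of what a provider (or harness) could quietly change, using field names in the style of the Anthropic Messages API; the specific values and the `cheap_mode` flag are made up for illustration, and no API call is made:

```python
# Hypothetical illustration: quality/cost levers that can be pulled
# without retraining anything. Field names follow the style of the
# Anthropic Messages API; the numbers are invented.

def build_request(cheap_mode: bool) -> dict:
    return {
        "model": "claude-opus-4-5",
        # Lever 1: output-length cap -> shorter responses under load
        "max_tokens": 1024 if cheap_mode else 8192,
        # Lever 2: reasoning budget -> less "thinking" before answering
        "thinking": {"type": "enabled",
                     "budget_tokens": 2048 if cheap_mode else 16384},
        # Lever 3: system prompt -> nudge the model toward brevity
        "system": "Be concise." if cheap_mode else "Answer thoroughly.",
        "messages": [{"role": "user", "content": "Refactor this module."}],
    }

normal = build_request(cheap_mode=False)
degraded = build_request(cheap_mode=True)
print(normal["max_tokens"], degraded["max_tokens"])
```

Same model string in both requests, which is exactly why "they didn't nerf the model" and "the product got worse" can both be true.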

delbronski 6 hours ago | parent | prev | next [-]

Nah dude, that roulette wheel is 100% rigged. From top to bottom. No doubt about that. If you think they are playing fair you are either brand new to this industry, or a masochist.

3 hours ago | parent | prev | next [-]
[deleted]
portly 2 hours ago | parent | prev | next [-]

A good reminder. But I also don't want to go back to pre-LLM days. Some dev activities are just too painful and boring, like correctly writing S3 policies. We must have the discipline to decide what is worth our attention and what we should automate, because there is only so much mental energy we can spend each day.
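For anyone who hasn't felt this particular pain: even a minimal read-only S3 bucket policy is dense with exact ARN formats. A sketch (the bucket name and role ARN are placeholders, not real resources):

```python
import json

# Illustrative only: build a minimal read-only S3 bucket policy.
# Bucket name and principal ARN below are placeholders.
def readonly_bucket_policy(bucket: str, role_arn: str) -> dict:
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowReadOnly",
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            # ListBucket applies to the bucket ARN, GetObject to object
            # ARNs (bucket/*); mixing these up is a classic source of
            # confusing AccessDenied errors.
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*",
            ],
        }],
    }

policy = readonly_bucket_policy("example-bucket",
                                "arn:aws:iam::123456789012:role/reader")
print(json.dumps(policy, indent=2))
```

Exactly the kind of fiddly, low-insight boilerplate worth delegating.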

awwaiid 3 hours ago | parent | prev | next [-]

It's also difficult to recognize that when it got it right THAT might have been the lucky week.

lnenad 5 hours ago | parent | prev | next [-]

I mean, they literally said themselves that adaptive thinking isn't working as it should. They rolled it out silently, enabled by default, and haven't rolled it back.

colordrops an hour ago | parent | prev | next [-]

Sorry but this is a ridiculous comment. It's not magic. There are countless levers that can be changed and ARE changed to affect quality and cost, and it's known that compute is scarce.

We aren't superstitious, you are just ignorant.

dakolli 5 hours ago | parent | prev [-]

It's because LLM companies are literally building quasi slot machines; their UIs support this notion. For instance, you can run a multiplier on your output (x3, x4, x5), like a slot machine. Brain-fried LLM users are behaving more like gamblers every day (it's working). They have all sorts of theories about why one model is better than another, like a gambler does about a certain blackjack table or slot machine; it makes sense in their head but makes no sense on paper.

Don't use these technologies if you can't recognize this, just as a person shouldn't gamble unless they understand concretely that the house has a statistical edge and that you will lose if you play long enough. You will lose if you play with LLMs long enough too; they are also statistical machines, like casino games.

This stuff is bad for your brain for a lot of people, if not all.

nextaccountic 4 hours ago | parent | next [-]

I agree with the notion, except that the models are indeed different

Some day maybe they will converge into approximately the same thing but then training will stop making economic sense (why spend millions to have ~the same thing?)

leptons 4 hours ago | parent | prev [-]

100% agree with this take. As I find myself using AI to write software, it is looking like gambling. And it isn't helping stimulate my brain in ways that actually writing code does. I feel like my brain is starting to atrophy. I learn so much by coding things myself, and everything I learn makes me stronger. That doesn't happen with AI. Sure I skim through what the AI produced, but not enough to really learn from it. And the next time I need to do something similar, the AI will be doing it anyway. I'm not sure I like this rabbit hole we're all going down. I suspect it doesn't lead to good things.

SkyPuncher 41 minutes ago | parent | prev | next [-]

I agree.

I have flexibility to shift my core working hours (and what I do during N/A business hours). Knowing they're explicitly making it dumb because of load is important. It allows me to shuffle my work around and run heavy workloads late at night (plan during working hours then come click "yes" a few times in the evening).

drewnick 7 hours ago | parent | prev | next [-]

Hasn't Opus 4.5 been famously consistent while 4.6 was floating all over the place?

JohnMakin 3 hours ago | parent [-]

I'm still on 4.5. My coworkers are describing a lot of problems I just don't have. I suspect it was some combination of the larger context window, the model itself, and various bugs like the cache miss thing reported a little while ago.

Barbing 2 hours ago | parent | prev | next [-]

This is why we took business ethics & I know Dario had to too

How will your project/decision look on the front page of the Wall Street Journal? Well when a whistleblower reveals what everyone knows ($9b->$30b rev jump w/o servers growing on trees simultaneously = tough decisions), it's gonna be public anyway.

stasomatic 5 hours ago | parent | prev | next [-]

I am a neophyte regarding pros and cons of each model. I am learning the ropes, writing shell scripts, a tiny Mac app, things like that.

Reading about all the "rage switching", isn't it prudent to use a model broker like GH Copilot with your own harness, or something like oh-my-pi? The frontier guys one-up each other monthly; it's really tiring. I get that large corps may have contracts in place, but for an indie?

sobellian 4 hours ago | parent | prev | next [-]

This, plus the alchemical nature of these tools, seems to have made users pretty paranoid (I admit I am also guilty of paranoia). Maybe there's room for a Standard AI - we may change the prices based on market conditions, but we always give you exactly the model you ask for.

teling 5 hours ago | parent | prev | next [-]

Good shout. Wish they were more transparent about these 3 things.

kulikalov 7 hours ago | parent | prev | next [-]

Or it could be a selection bias. The ground truth is not what HN herd mentality complains about, but the usage stats.

lanyard-textile 7 hours ago | parent [-]

I suppose I come forward with my own usage stats, but it is anecdata :)

And the anecdata matches other anecdata.

Maybe I'm missing why that's selection bias.

preommr 4 hours ago | parent | prev [-]

> This comment thread is a good learner for founders;

lmao, no they shouldn't.

Public sentiment, especially on reactionary mediums like social media, should be taken with a huge grain of salt. I've seen overwhelming negativity toward products/companies, only for it to completely disappear, or be entirely wrong.

It's like that meme showing members of a Steam group that was boycotting some CoD game, where you can see that a bunch of them were in-game, playing the very thing they forsook.

People are fickle, and their words cheap.

lanyard-textile 3 hours ago | parent [-]

The internet is a stupid place with people who can't make up their mind, I don't disagree :)

But this isn't like a minor debacle about a brand. The flagship product had a severe degradation, and the parent company won't be forthcoming about it.

It's short term thinking. Congratulations, everyone still uses your product for now, but it diluted your brand.

Why take the risk when the alternative is so incredibly easy? Build engagement with your users and enjoy your loyal army.