sbuttgereit 3 days ago

Hmmm... I wonder if this is why some of the results I've gotten over the past few days have been pretty bad. It's easy to blame poor results on prompt-to-prompt LLM quality variance rather than on something like this, where quality is actively degraded without notification. I can't say this is in fact what I'm experiencing, but it was noticeable enough that I'm going to check.

jmathai 3 days ago | parent | next [-]

Never occurred to me that the response changes based on load. I’ve definitely noticed it seems smarter at times. Makes evaluating results nearly impossible.

kridsdale1 3 days ago | parent [-]

My human responses degrade when I’m heavily loaded and low on resources, too.

TeMPOraL 3 days ago | parent [-]

Unrelated. Inference doesn't run in sync with the wall clock; it takes whatever time it takes. The issue is more like telling a room of support workers they're free to half-ass the work when there are too many calls, so they don't reject any until even half-assing doesn't lighten the load enough.
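The policy described above, degrade before rejecting, can be sketched as a toy admission controller. The class and thresholds are made up for illustration; they aren't anyone's actual serving code:

```python
class SupportDesk:
    """Toy load-shedding policy: serve at full quality when quiet,
    cut corners when busy, and reject only when even the cheap
    path can't keep up with the backlog."""

    def __init__(self, full_capacity: int, cheap_capacity: int):
        # Below full_capacity pending calls: answer properly.
        # Up to cheap_capacity: answer, but with reduced effort.
        # Beyond that: turn callers away.
        self.full_capacity = full_capacity
        self.cheap_capacity = cheap_capacity

    def handle(self, pending_calls: int) -> str:
        if pending_calls <= self.full_capacity:
            return "full-quality answer"
        if pending_calls <= self.cheap_capacity:
            return "degraded answer"
        return "rejected"


desk = SupportDesk(full_capacity=10, cheap_capacity=50)
print(desk.handle(5))    # quiet: full quality
print(desk.handle(30))   # busy: half-assed but not refused
print(desk.handle(100))  # overwhelmed: even degradation isn't enough
```

The point of the analogy is the middle branch: from the caller's side, a "degraded answer" and a "full-quality answer" arrive the same way, which is exactly why it's hard to notice.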

Seattle3503 3 days ago | parent | prev | next [-]

This is one reason closed models suck. You can't tell whether the bad responses are due to something you're doing, or whether the company you're paying to generate them is cutting corners in search of efficiency, e.g. by reducing the number of bits used for inference. It's a black box.
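A back-of-the-envelope way to see why "reducing the number of bits" costs quality: uniform quantization maps values onto a coarser grid, and the rounding error grows as the bit width shrinks. This is a generic sketch of that effect, not something any provider has confirmed doing:

```python
import random

def quantize(xs, bits):
    """Uniformly quantize values to 2**bits levels over their range,
    then map back to floats -- a crude stand-in for low-bit inference."""
    lo, hi = min(xs), max(xs)
    levels = 2 ** bits - 1
    return [round((x - lo) / (hi - lo) * levels) / levels * (hi - lo) + lo
            for x in xs]

random.seed(0)
w = [random.gauss(0, 1) for _ in range(10_000)]  # fake "weights"

for bits in (8, 4, 2):
    q = quantize(w, bits)
    err = sum(abs(a - b) for a, b in zip(w, q)) / len(w)
    print(f"{bits:>2}-bit mean abs error: {err:.4f}")
```

Running it shows the mean error climbing steadily as bits drop, which is the kind of silent degradation being complained about: the model still answers, just a little more wrongly.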

mirsadm 3 days ago | parent [-]

To be fair, even if you did know, it would still behave the same way.

TeMPOraL 3 days ago | parent [-]

Still, knowing is what makes the difference between gaslighting and merely subpar/inconsistent service.

baxtr 3 days ago | parent | prev | next [-]

Recently I started wondering about the quality of ChatGPT. A couple of times I was like: "hmm, I'm not impressed at all by this answer, I'd better google it myself!"

Maybe it’s the same effect over there as well.

dave84 3 days ago | parent [-]

Recently I asked 4o to 'try again' when it failed to respond fully, and it started telling me about some song called Try Again. It seems to lose context a lot in conversations now.

55555 3 days ago | parent | prev [-]

Same experience here.