They might not be, and their use-case might not be one I agree with. I can just imagine a plausible reality where they made a reasonable decision given the incentives and constraints, and I default to that.

I'm basically inferring how this would go down in the context I worked under, not the GP's, because I don't know the details of their real context.

I think I'm seeing where I'm not being as clear as I could, though.

I'm talking about the lifecycle of a methodology for categorizing calls, regardless of whether or not it's a human categorizing them or a machine.

If your call center agent is writing summaries and categorizing their own calls, you still typically have a QA department of humans that listen to a random sample of full calls for any given agent on a schedule to verify that your human classifiers are accurately tagging calls. The QA agents will typically listen to them at like 4x speed or more, but mostly they're just sampling and validating the sample.

The same goes for _any_ automated process you want to apply at scale. You run it in parallel with your existing methodology and randomly sample classified calls to verify that the results were correct. You _also_ compare the overall results of the new method to the existing one, because you know how accurate the existing method is.

But you don't do that for _every_ call.
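To make the sampling part concrete, here's a rough sketch in Python. Everything in it is made up for illustration: `audit_sample`, the `tag` key, and the `review` callback (which stands in for a QA person listening to the call). The interval is the plain normal approximation, which gets shaky for small samples or accuracy near 0 or 1.

```python
import math
import random


def audit_sample(calls, review, sample_size, seed=None):
    """Estimate tagging accuracy by reviewing a random sample of calls.

    calls:  list of dicts with a hypothetical 'tag' key (the label under test)
    review: callable returning the label a QA reviewer would assign
    Returns (accuracy, lower, upper) using a rough 95% interval.
    """
    if not calls or sample_size <= 0:
        raise ValueError("need at least one call and a positive sample size")

    rng = random.Random(seed)
    sample = rng.sample(calls, min(sample_size, len(calls)))
    n = len(sample)

    # Only the sampled calls get a human review, never the full set.
    correct = sum(1 for call in sample if call["tag"] == review(call))
    accuracy = correct / n

    # Normal-approximation margin of error at ~95% confidence.
    margin = 1.96 * math.sqrt(accuracy * (1 - accuracy) / n)
    return accuracy, max(0.0, accuracy - margin), min(1.0, accuracy + margin)
```

The point is just that the review cost scales with `sample_size`, not with the number of calls.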

You find a new methodology you think is worth trying and you trial it to validate the results. You compare the cost and accuracy of that method against the cost and accuracy of the old one. And you absolutely would often have a real human listen to full calls, just not _all_ of them.
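A toy version of that comparison, assuming you've already measured accuracy from sampled reviews and know a per-call cost for each method. `TrialResult`, `pick_method`, and the decision rule itself are hypothetical; a real decision would weigh a lot more than two numbers.

```python
from dataclasses import dataclass


@dataclass
class TrialResult:
    name: str
    accuracy: float       # fraction of sampled calls tagged correctly
    cost_per_call: float  # in whatever currency you budget in


def pick_method(existing, candidate, min_accuracy):
    """Switch only if the candidate is accurate enough AND cheaper.

    This rule is an assumption for illustration, not a standard.
    """
    accurate_enough = candidate.accuracy >= min_accuracy
    cheaper = candidate.cost_per_call < existing.cost_per_call
    return candidate if (accurate_enough and cheaper) else existing
```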

In that respect, LLMs aren't particularly special. They're just a function that takes a call and returns some categories and metadata. You compare that to the output of your existing function.
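In code terms, something like the sketch below. The `Classifier` type and `compare_classifiers` are names I'm making up here; `existing` could be a wrapper around human tags and `candidate` a wrapper around an LLM call. Neither is a real API.

```python
from typing import Callable, List, Tuple

# A classifier is just "call transcript in, category out".
Classifier = Callable[[str], str]


def compare_classifiers(
    transcripts: List[str],
    existing: Classifier,
    candidate: Classifier,
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Run both methods on the same calls and report where they differ."""
    if not transcripts:
        raise ValueError("need at least one transcript")

    disagreements = []
    for text in transcripts:
        old_label, new_label = existing(text), candidate(text)
        if old_label != new_label:
            # These are the calls worth sending to a human reviewer.
            disagreements.append((text, old_label, new_label))

    agreement = 1 - len(disagreements) / len(transcripts)
    return agreement, disagreements
```

Agreement on its own doesn't tell you which method is right. It only narrows down which calls a human should look at.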

But it's all part of the same loop: consider new tech -> set up conditions to validate it quantitatively -> run trials -> measure -> compare -> decide.

Then on a schedule you go back and do another analysis to make sure your methodology is still providing the accuracy you need it to, even if you haven't changed anything.
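The scheduled re-check can be as simple as this sketch. The `flag_audits` name, the period labels, and the threshold are all placeholders, not figures from any real operation.

```python
def flag_audits(audits, required_accuracy):
    """Return the audit periods where measured accuracy fell short.

    audits: list of (period_label, accuracy) pairs, oldest first.
    """
    return [label for label, accuracy in audits if accuracy < required_accuracy]


# Made-up numbers, just to show the shape of the check.
audits = [("2024-Q1", 0.93), ("2024-Q2", 0.91), ("2024-Q3", 0.86)]
slipped = flag_audits(audits, required_accuracy=0.90)
```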

Imustaskforhelp 3 days ago | parent [-]

Man, firstly I wanted to say that I loved the comment of yours that I responded to, and this one too. I actually feel happy reading it. Maybe it's hard to explain, but maybe it's because I learned something new.

So firstly, I thought you meant that they had to listen to every call, so uh yeah, a misunderstanding, since I admittedly don't know much about it. Still, it's great to hear from an expert.

I also don't know the GP's context, but I truly felt this way because, as I said in some other comments too, people are just slapping AI stickers on things and markets are rewarding it, even though they're mostly being reckless in how they use AI (which the post basically says). I assumed the GP was doing the same, and I still have my doubts. Only more context from their side can tell.

Secondly, I really appreciate the paragraph you wrote about testing different strategies, and how in-depth you went, man. It really feels like one of those comments that will be useful to me one day or another. Seriously, thanks!

	▲	doorhammer 3 days ago | parent [-]
		Hey, thanks for saying that. I have huge gaps in time commenting on HN stuff because tbh, it's just social anxiety I don't need to sign up for :| so I really value someone taking the time to express appreciation if they got something out of my novels.

		I don't ever want to come across like I think I know what's up better than someone else. I just want to share my perspective given my experience and, if I'm wrong, hope someone will be kind when they point it out. Tbh it's been a while since I've worked directly in a call center (I've done some consulting-type stuff here and there since then, but not much), so I'm mostly just extrapolating based on new tech and people I still know in that industry.

		Fwiw, the way I try to approach interpreting something like the GP's post is to try to predict the possible realities and decide which ones I think are most plausible. After that I usually contribute the less represented perspective, but only if I think it's plausible.

		I think the reality you were describing is totally plausible. My gut feeling is that it's probably not what's happening, but I wouldn't bet any money on that. If someone said "Pick a side. I'll give you $20k if you're right and take $20k if you're wrong" I'm just flat out not participating, lol. If I _had_ to participate I'd reluctantly take the benefit-of-the-doubt side, but I wouldn't love having to commit to something I'm not at all confident about.

		As it stands it's just a fun vehicle to talk about call center dynamics. Weirdly, I think they're super interesting.