dwroberts 8 hours ago

> Call me when it stops making things up.

We haven’t moved past this yet

jrflo 7 hours ago | parent | next [-]

To be fair, I think we'd be able to claim AGI is here if that problem were solved. At this point the models are so smart they'd be borderline superintelligent if they were cognizant of hallucinations and their own shortcomings. If GPT 5.5 or Opus 4.8 could tell you "I don't know" they'd certainly be "smarter" than any individual human. Some specialists might be better in niche domains, but I don't know of any humans who are experts at that level in every field.

OtomotO 7 hours ago | parent [-]

Your condition is called AI psychosis. Good news is: it's curable!

jrflo 7 hours ago | parent [-]

Care to elaborate? What is AI psychosis? How am I exhibiting it? I thought Hacker News was the last place free of mindless dunking on the internet; I guess I was wrong. If you'd like to engage in a debate on the original topic of this thread I'd be more than happy to, but if you want to dunk, Twitter is over at x.com now.

inigyou 4 hours ago | parent | prev | next [-]

This! It got a much wider pool of templates to copy from, so now if you ask for a 3D web game it gives you a similarly boring game using three.js instead of failing entirely. It still has no imagination and still makes up nonsense all the time. The fundamental problems haven't improved.

AndrewKemendo 7 hours ago | parent | prev [-]

I’m unaware of any humans that don’t have this failure mode also.

ofjcihen 7 hours ago | parent | next [-]

This argument style is always humorous. The intention is something like “so humans are as bad as AI” when the original question boils down to something like “why would I replace humans with AI?”.

sph 5 hours ago | parent | next [-]

We must give this fallacy a name. It’s the facile way out of the argument used by boosters whenever one dares to criticize LLMs.

Ukv 7 hours ago | parent | prev | next [-]

> The intention is something like “so humans are as bad as AI” when the original question boils down to something like “why would I replace humans with AI?”

If AI really is at human level quality/error rate (I don't think it is for general tasks, but there are some areas where it is), then the answer is typically cost and speed/capacity.

ofjcihen 7 hours ago | parent [-]

Have outputs from engineers traditionally been measured in cost and speed?

Remember, we aren’t just talking about the product you create. While you would measure deliverables by cost and speed, are we ignoring something else? Something that could potentially be more important than either of those metrics?

Ukv 7 hours ago | parent [-]

> Have outputs from engineers traditionally been measured in cost and speed?

Yes. How long it'll take and how much it'll cost are going to be among pretty much any customer's first questions.

They're not the only considerations, and could potentially be outweighed by other concerns even when quality is the same, but I think they are the main drivers of AI adoption in industry. If error rate is the same, a $1/hr (amortized) camera and machine vision model capable of checking 300ft of material for defects per minute will likely be preferred to a $10/hr human QA capable of checking 30ft per minute, for instance.
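
To put rough numbers on that hypothetical, here is a minimal sketch of the cost-per-foot arithmetic. The figures are the ones from the example above; the function name `cost_per_foot` is made up for illustration, and the calculation assumes both inspectors run at their stated rate for the full hour with equal error rates.

```python
def cost_per_foot(hourly_cost_usd: float, feet_per_minute: float) -> float:
    """Dollars spent per foot of material inspected (hypothetical helper)."""
    feet_per_hour = feet_per_minute * 60
    return hourly_cost_usd / feet_per_hour


# Figures taken from the hypothetical above, not from real-world data.
camera = cost_per_foot(hourly_cost_usd=1.0, feet_per_minute=300.0)
human = cost_per_foot(hourly_cost_usd=10.0, feet_per_minute=30.0)

# How many times more the human inspector costs per foot checked.
ratio = human / camera

print(f"camera: ${camera:.6f}/ft")
print(f"human:  ${human:.6f}/ft")
print(f"human costs about {ratio:.0f}x more per foot")
```

Ten times the hourly cost at one tenth the throughput works out to roughly a 100x gap per foot inspected, which is why cost and speed can dominate once error rates are assumed equal.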

ofjcihen 7 hours ago | parent [-]

Oh, machine learning has been useful for measuring deterministic and non-deterministic outputs for a long time.

But that’s not the argument here, is it?

So the question still stands.

Ukv 6 hours ago | parent [-]

My understanding of your argument is (paraphrasing):

> > People try to excuse AI issues/failure modes by saying humans have them too, but even if they're equally bad then what would be the whole point of replacing a human worker with AI?

To which my response is that speed and cost are also important factors, which can often give AI the edge in considerations when quality/error rate is equal.

If you meant something other than that, you may have to specify.

ofjcihen 5 hours ago | parent [-]

Sure, let me be blunt.

Speed and cost are nothing without quality, and quality is partially a product of accountability (not even considering the technical or logistical issues; this is enough on its own).

An AI cannot be held accountable. It does not desire to feed its family.

Your counter argument was outside the context of this article's claims, specifically that programmers and other knowledge workers can be replaced by LLMs.

Simple yes/no outcomes generated by vision-based machine learning are quite different from “build me a product people will be happy with” being asked of a non-deterministic machine.

Ukv 4 hours ago | parent [-]

> Speed and cost are nothing without quality

Quality was what the hypothetical was assuming had reached parity, no? ("humans are as bad as AI", "If AI really is at human level quality/error rate", etc.)

> accountability

Why could the company not still take accountability? That's already the case for non-ML automated systems, some with high failure rates. As a customer I rarely if ever care about blame being pinned on a specific employee.

> Your counter argument was outside the context of this article's claims, specifically that programmers and other knowledge workers can be replaced by LLMs.

AndrewKemendo's comment and your reply ("any humans", "replace humans") seemed to generalize, but speed and cost being important factors is still true for knowledge work. For some given level of quality, a web developer offering a lower quote with shorter turnaround time will be preferred to one offering a higher quote with longer turnaround time.

AndrewKemendo 3 hours ago | parent [-]

Correct

For the set of [all tasks humans can do] there is a subset of [things automation can do at equivalent or better error rates].

And that’s the baseline. The baseline is not some platonic ideal that is never reached; the baseline for a business operator is cost per delivery/error.

The idea that existing human systems are optimized or otherwise “not broken” is the key fallacy that people keep making.

AndrewKemendo 7 hours ago | parent | prev [-]

The entire purpose of automation is to remove a capacity-limited human from a continuous workflow because the automated system handles that workflow more capably and with fewer errors than the human.

See: traffic lights

ofjcihen 7 hours ago | parent | next [-]

That’s a great explanation of automation.

If I have a choice between a deterministic traffic light and a non-deterministic traffic light which one would I use?

And yes, before you say “this isn’t a comparison of non-deterministic and deterministic tools, this is a comparison of two non-deterministic tools”, think about what my next question might be.

overgard 6 hours ago | parent | prev [-]

An AI traffic light sounds like an excellent way to kill a lot of people

Diogenesian 7 hours ago | parent | prev | next [-]

I am unaware of any healthy human who confabulates things as arbitrarily and disastrously as a SOTA reasoning model. It is childish to say stuff like "lawyers always made up court cases" - no they didn't!

7 hours ago | parent [-]

[deleted]

whateveracct 7 hours ago | parent | prev | next [-]

? in a professional setting, my coworkers are just randomly gonna make stuff up

AndrewKemendo 7 hours ago | parent [-]

That is the entire industry of business consulting.

Boston Consulting Group, Bain, and McKinsey make billions a year completely making shit up. Same thing with Ernst & Young and any of these organizations that make these “future of (insert market)” reports.

whateveracct 7 hours ago | parent [-]

right, so none of my human coworkers ever

zero-sharp 7 hours ago | parent | prev [-]

Look, I don't spend most of my time online criticizing AI progress. But what does your response even mean? People hallucinating work and solutions isn't commonplace at all, right? What industry do you work in where people hallucinate with frequency?

snozolli 7 hours ago | parent [-]

I can't speak to GP's intention, but I've personally witnessed a guy on my team who was trying to position himself as the go-to technical dude. He was jockeying for a management role. When QA or customer support had questions about our products, he'd always have an answer. I would say that at least 50% of the time, his answer was completely fabricated nonsense. He'd wildly misrepresent projects that his teammates were working on. I also saw several incidents of cargo-cult programming from him. Bizarrely, this never bit him in the ass and now he's a middle manager at a FAANG. This experience leaves me without much hope for the future of software development as a career.