KPMG pulls report on AI usage due to apparent hallucinations (techcrunch.com)
94 points by Brajeshwar 5 hours ago | 14 comments
simonw 32 minutes ago | parent | next [-]

This isn't the first time this has happened, either. I do not understand how these consultancies - who sell these "reports" for six or seven digit sums - continue to mess this up. It should be excruciatingly embarrassing for them.

I guess nobody ever got fired for paying KPMG and friends for an expensive report that supported their priors.

gdulli 3 hours ago | parent | prev | next [-]

> Professional services firm KPMG has pulled a report titled, “Redefining excellence in the age of agentic AI,”

Well they were true to their word about demonstrating a new and increasingly relevant definition of "excellence."

Scoundreller 3 hours ago | parent | prev | next [-]

Gartner is going to have to pull a loooot of reports over the years

jruohonen 5 hours ago | parent | prev | next [-]

Go, GPTZero!

XenophileJKO 2 hours ago | parent | prev | next [-]

The crazy thing is the level of effort to say, "have a sub agent validate all references and figures" is so low. I'm paraphrasing, but you don't need much more than that. It would have prevented 99% of the face palms.

I use this regularly for my personal financial research system. Even flagship models make mistakes, though currently the issue is usually the model using a figure from an older report. A cross-check reduces that dramatically.

watwut 14 minutes ago | parent | next [-]

Thinking that such a prompt will cause the report to be factual is the root issue. No, it won't. No, it is not enough.

iugtmkbdfil834 an hour ago | parent | prev | next [-]

Eh.. without going into too many details, having seen some face palms at work, I realized that the anecdotes may be closer to a pattern than I would like to believe, which prompted me to start making basic howtos available company-wide.

I kinda get it: without experience and trying, how are they to know (unless they are already 'into it')? After all, corporate training is laughable at best.

modzu an hour ago | parent | prev [-]

Don't be so sure they didn't. They can go back and forth hallucinating with each other.

figassis 19 minutes ago | parent [-]

This is where the absolutism of letting agents do 100% of the work fails. You get adversarial agents pulling all references into a table; they might miss some, so run this a few times.

Then have another set of agents with skills like web browsing (to verify that links actually exist, maybe that references and abstracts actually match, etc.), and have one engineer (or agent) write a small script to help with this (just make sure you test it a bit).

So your work is not verified until your references table is 90% green checkmarks, maybe with uncertainty figures.

A human can then verify the ones with under 90% certainty.

This alone gets you a long way there. It does not cost the millions they're being paid.
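A rough sketch of the triage step described above, assuming the checking agents (or a script) have already filled in a references table. The `ReferenceCheck` fields, the `triage` function, and the way certainty is scored are all hypothetical names for illustration, not any particular tool's API; the two 90% thresholds come from the comment.

```python
from dataclasses import dataclass

CERTAINTY_THRESHOLD = 0.90   # rows below this certainty go to a human
GREEN_SHARE_REQUIRED = 0.90  # share of green rows before the work counts as verified


@dataclass
class ReferenceCheck:
    """One row of the references table (hypothetical structure)."""
    citation: str        # the reference as it appears in the report
    link_resolves: bool  # result of a link check done elsewhere
    certainty: float     # 0.0 to 1.0, assumed to come from the checking agents

    @property
    def green(self) -> bool:
        # Green checkmark: the link exists and certainty meets the threshold.
        return self.link_resolves and self.certainty >= CERTAINTY_THRESHOLD


def triage(rows):
    """Return (verified, green_share, needs_human) for a references table."""
    needs_human = [r for r in rows if not r.green]
    green_share = (len(rows) - len(needs_human)) / len(rows) if rows else 0.0
    return green_share >= GREEN_SHARE_REQUIRED, green_share, needs_human


if __name__ == "__main__":
    # Placeholder rows only; these are not real references.
    table = [ReferenceCheck(f"placeholder ref {i}", True, 0.95) for i in range(9)]
    table.append(ReferenceCheck("placeholder ref 9", False, 0.40))
    verified, share, queue = triage(table)
    print(verified, share, [r.citation for r in queue])
```

An empty table is deliberately treated as not verified, so a failed extraction run cannot pass by accident.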

It's quite interesting that these companies marketed themselves as the best of the best in excellence, accepting no mistakes. I can imagine the countless keynotes and books about this. Or the sales pitches.

Has always been a lie, they just understood how to hide it. Today they don't, and it's embarrassing.

e12e 11 minutes ago | parent [-]

> A human can then verify the ones with under 90% certainty.

How about the author actually reads the finished report a couple of times and checks all the references?

It really is the lowest bar - even lower maybe than running a spell check.

ChrisArchitect 5 hours ago | parent | prev | next [-]

[dupe] https://news.ycombinator.com/item?id=48515733

wglb 4 hours ago | parent [-]

The Register article is better.

rconti 3 hours ago | parent [-]

Every once in a while, someone utters a truly unique statement.

cryo32 an hour ago | parent | prev [-]

KPMG got called out only now for bullshit and hallucinations?