nerdjon 7 hours ago

Even though these tools are showing time and time again that they have serious reliability issues, somehow people still think it is a good idea to use them for critical decisions.

Still regularly get wrong information from Google's search AI.

Really starting to wonder if common sense is ever going to come back with new tech, but I fear it is going to require something truly catastrophic to happen.

bubblewand 5 hours ago | parent | next [-]

I’ve got a popcorn reserve at hand to watch the show when the massive security breaches happen and people start freaking out. And/or a lawsuit forces discovery of a company’s LLM history, it’s every bit as awful for them as we all know it will be, and the rest of corporate America pumps the brakes.

These systems are borderline useless if you don’t give them dangerous levels of access to data and generate tons of juicy chat history with them. What’s coming is very predictable.

lkbm 7 hours ago | parent | prev | next [-]

> Still regularly get wrong information from Google's search AI.

The fact that the model most hyper-optimized for cheap+fast makes mistakes is not a particularly compelling argument.

mayneack 5 hours ago | parent | next [-]

Then Google shouldn't be using something so unreliable for anything important. Arguing that random users should know the difference between cheap and frontier models is also not compelling. It's all the same "AI" to most people.

raddan 6 hours ago | parent | prev [-]

You are mistaken. ChatGPT Health [1] is a model specifically designed for health applications and was co-developed with a benchmark suite, HealthBench [2], for evaluating performance on health-related questions. This study suggests that HealthBench has some concerning external validity problems.

[1] https://openai.com/index/introducing-chatgpt-health/

[2] https://cdn.openai.com/pdf/bd7a39d5-9e9f-47b3-903c-8b847ca65...

GaryBluto 6 hours ago | parent [-]

GP was referring to Google's search AI, not ChatGPT Health.

yodsanklai 6 hours ago | parent | prev | next [-]

It's a strange paradigm shift, where the tool is right and useful more often than not, but also makes expensive mistakes that would have been spotted easily by an expert.

nxm 5 hours ago | parent [-]

Human experts make expensive mistakes all the time.

yodsanklai 8 minutes ago | parent [-]

Not at the same rate as AI supervised by a non-expert.

duskdozer 7 hours ago | parent | prev [-]

It's really the "common sense" i.e. believing things without thinking because they "sound right" or because it's what your parents told you a lot growing up or because you watched an ad saying it a hundred times that's the issue. People don't want "the truth" or uncomfortable realities; they want comfortable, easily digestible bullshit. Smooth talkers filled the role before and LLMs are filling that role now.