WhatIsDukkha 3 days ago

I would never ask any of these questions of an LLM (and I use and rely on LLMs multiple times a day); this is a job for a computer.

I would never ask a coworker for this precise number either.

jcynix 3 days ago | parent | next [-]

My reasoning for the plain question was: as people start to replace search engines with AI chat, I thought that asking "plain" questions to see how trustworthy the answers might be would be a good test. Plain folks will ask plain questions and won't think about the subtle details. They would not expect a "precise number" either, i.e. not 23:06 PDT, but would like to know whether this weekend would be fine for a trip, or whether the previous or next weekend would be a better time to book a "dark sky" tour.

And, BTW, I thought that LLMs are computers too ;-0

WhatIsDukkha 3 days ago | parent [-]

I think it's much better to help people learn that an LLM is "not" a computer (even if it technically is).

Thinking it's a computer makes you do dumb things with LLMs that they simply have never done a good job with.

Build intuitions about what they do well and intuitions about what they don't do well and help others learn the same things.

Don't encourage people to have poor ideas about how LLMs work; it makes things worse.

Would you ask an LLM for a phone number? If it doesn't use a function call, the answer is simply not worth having.

achierius 3 days ago | parent | prev | next [-]

But it's a good reminder when so many enterprises like to claim that hallucinations have "mostly been solved".

WhatIsDukkha 3 days ago | parent [-]

I agree with you partially, BUT

when are the many 'enterprise' coworkers who have glibly and overconfidently answered questions, without doing the math or looking anything up, going to be fired?

stavros 3 days ago | parent | prev | next [-]

First we wanted to be able to do calculations really quickly, so we built computers.

Then we wanted the computers to reason like humans, so we built LLMs.

Now we want the LLMs to do calculations really quickly.

It doesn't seem like we'll ever be satisfied.

WhatIsDukkha 3 days ago | parent [-]

Asking the LLM what calculations you might or should do (and how you might implement and test those calculations) is pretty wildly useful.

ec109685 3 days ago | parent | prev [-]

These models are being proclaimed as near AGI, so they should be smart enough not to hallucinate an answer.