nradov 7 hours ago

We're really not very good at determining which humans are trustworthy. Most people barely do better than a coin flip at detecting lies.

simonw 5 hours ago | parent | next [-]

The biggest difference on this front between a human and an LLM is accountability.

You can hold a human accountable for their actions. If they consistently fall for phishing attacks you can train or even fire them. You can apply peer pressure. You can grant them additional privileges once they prove themselves.

You can't hold an AI system accountable for anything.

nradov 39 minutes ago | parent | next [-]

You can hold the person (or corporate person) who owns or uses the LLM accountable for its actions. It's like how dogs aren't really accountable. But if you let your dog run loose and it mauls a toddler to death, then you'll probably be sued. Same thing.

(Yes, I am aware this isn't a perfect analogy because a dangerous dog can be seized and destroyed. But that's an administrative procedure and really not the same as holding a person morally or financially accountable.)

Verdex 4 hours ago | parent | prev [-]

Recently, I've kind of been wondering if this is going to turn out to be LLM codegen's Achilles' heel.

Imagine a code component of critical infrastructure that costs the company millions per hour when it goes down, and it turns out the entire team maintaining it is just a thin wrapper around an LLM. The infra goes down in a way the LLM can't fix, and now what would have been a few late nights becomes several months to spin up a new team.

Sure, you can hold the team accountable by firing them. But that's only a threat to someone with actual technical know-how, because their reputation is damaged: they got fired doing such-and-such, so can we trust them to do it here?

The person who faked it with an LLM just needs to find another domain where their reputation won't follow them, and fake their way through until the next catastrophe.

jmogly 6 minutes ago | parent [-]

This is a fascinating idea. Imagine a company spins up a super complex stack using LLMs; it works and becomes vital. It breaks occasionally, and they use a combination of LLMs, hope, and prayer to keep the now-vital system up and running. Then the system hits a limit (data volume, code optimization, number of users) and the LLM isn't able to solve the issue this time. They try to bring in a competent engineer or team of engineers, but no one who could fix it is willing to take it on.

InsideOutSanta 5 hours ago | parent | prev | next [-]

Yeah, so many scammers exist because most people are susceptible to at least some of them some of the time.

Also, pick your least favorite presidential candidate. They got about 50% of the vote.

Exoristos 5 hours ago | parent | prev | next [-]

Your source must have been citing a very controlled environment. In actuality, lies almost always become apparent over time, and general mendacity is something most people can sense from face and body language alone.

card_zero 6 hours ago | parent | prev | next [-]

Lies, or bullshit? I mean, a guessing game like "how many marbles" is a context that allows for easy lying, but "I wasn't even in town on the night of the murder" is harder work. It sounds like you're referring to some study of the marbles variety, and not a test of smooth-talking, the LLM's forte.

cj 6 hours ago | parent | prev [-]

Determining trustworthiness of LLM responses is like determining who's the most trustworthy person in a room full of sociopaths.

I'd rather play "2 truths and a lie" with a human than with an LLM any day of the week. There are so many more cues to look for with humans.

bluefirebrand 5 hours ago | parent [-]

The big problem with LLMs is that if you try to play 2 truths and a lie, you might just get 3 truths. Or 3 lies.