jamespo 8 hours ago

Yes, LLMs that don't disclose sources are much better.

browningstreet 7 hours ago | parent | next [-]

The LLMs I use all supply references.

onraglanroad 7 hours ago | parent [-]

Indeed! Sometimes even more than actually exist!

I don't think LLMs can be faulted on their enthusiasm for supplying references.

tialaramex 6 hours ago | parent [-]

Yup, there's a wonderful, presumably LLM-generated, response to somebody explaining how trademark law actually works: the LLM response insists the explanation was all wrong and cites several US law cases. Most of the cases don't exist, and the rest aren't about trademark law or anywhere close. But the LLM isn't supposed to say truths; it's a stochastic parrot, and it produces whatever looks most plausible as a response. "Five" is a pretty plausible response to "What is two plus three?", but that's not because it added 2 + 3 = 5.
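
To make "plausible, not computed" concrete, here's a toy sketch (the numbers are invented for illustration, not taken from any real model): greedy decoding just emits the likeliest continuation.

    # Toy next-token distribution for the prompt "What is two plus three?"
    # Probabilities are made up; no arithmetic happens anywhere.
    toy_distribution = {"five": 0.92, "four": 0.03, "six": 0.03, "fish": 0.02}

    # Greedy decoding: output whichever continuation is most probable.
    print(max(toy_distribution, key=toy_distribution.get))  # "five"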

johnisgood 6 hours ago | parent [-]

"Five" is not merely "plausible". It is the uniquely correct answer, and it is what the model produces because the training corpus overwhelmingly associates "2 + 3" with "5" in truthful contexts.

And the stochastic parrot framing has a real problem here: if the mechanism reliably produces correct outputs for a class of problems, dismissing it as "just plausibility" rather than computation becomes a philosophical stance rather than a technical critique. The model learned patterns that encode the mathematical relationship. Whether you call that "understanding" or "statistical correlation" is a definitional argument, not an empirical one.

The legal citation example sounds about right. It is a genuine failure mode. But arithmetic is precisely where LLMs tend to succeed (at small scales) because there is no ambiguity in the training signal.
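
And whether a model is reliably correct on a class of problems is testable. A minimal harness sketch, with the model call left as a pluggable function since I'm not assuming any particular API:

    import random
    import re

    def addition_accuracy(ask, trials=100):
        # ask: any callable taking a prompt string and returning a reply string.
        correct = 0
        for _ in range(trials):
            a, b = random.randint(0, 99), random.randint(0, 99)
            reply = ask(f"What is {a} plus {b}? Reply with digits only.")
            correct += reply.strip() == str(a + b)
        return correct / trials

    # Sanity check with an oracle that actually computes:
    def oracle(prompt):
        a, b = map(int, re.findall(r"\d+", prompt))
        return str(a + b)

    print(addition_accuracy(oracle))  # 1.0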

rvnx 8 hours ago | parent | prev | next [-]

LLMs have their issues too.

In everyday life you cannot read 20 books on every topic you are curious about, but you can ask 5 subject-experts (“the LLMs”) in 20 seconds, some of which will go check a few news websites (most of those are biased too).

Then you can ask for summaries of the pros and cons and form your own opinion.

Are they hallucinating? Could be. Are they lying? Could be. Have they been trained on what their masters told them to say? Could be.

But multiplying the number of LLMs reduces the risk.

For example, if you ask DeepSeek, Gemini, Grok, Claude, GLM-4.7, or some models that have no guardrails what they think about XXX, then perhaps there are interesting insights.
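
A rough sketch of that cross-checking idea (the per-model call is a hypothetical stub, since every provider's API differs; agreement across models is weak evidence, not proof):

    from collections import Counter

    def cross_check(prompt, askers):
        # askers: {"model name": callable(prompt) -> answer}, hypothetical stubs
        answers = {name: ask(prompt) for name, ask in askers.items()}
        tally = Counter(a.strip().lower() for a in answers.values())
        consensus, votes = tally.most_common(1)[0]
        return answers, consensus, votes / len(askers)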

jamespo 7 hours ago | parent [-]

This may shock you, but Wikipedia provides multiple sources; it even links to them. Where do you think the LLMs are getting their data from?
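
You can even pull those cited links yourself through the standard MediaWiki API (a sketch, assuming I remember the parse endpoint's response shape correctly):

    import requests

    def cited_links(title):
        resp = requests.get(
            "https://en.wikipedia.org/w/api.php",
            params={"action": "parse", "page": title,
                    "prop": "externallinks", "format": "json"},
        )
        # action=parse returns the article's external links under "parse".
        return resp.json()["parse"]["externallinks"]

    print(cited_links("Stochastic parrot")[:5])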

dfxm12 6 hours ago | parent [-]

To further this, articles also have an edit history and a talk page. Even if one disagrees with the consensus building, or suspects foul play and really wants to get to the bottom of something, all the info is there on Wikipedia!
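
The edit history is likewise one call away via the same MediaWiki API (a sketch; parameters as in its query documentation):

    import requests

    def recent_edits(title, limit=5):
        resp = requests.get(
            "https://en.wikipedia.org/w/api.php",
            params={"action": "query", "titles": title, "prop": "revisions",
                    "rvlimit": limit, "rvprop": "timestamp|user|comment",
                    "format": "json"},
        )
        # The response nests revisions under an opaque page id.
        page = next(iter(resp.json()["query"]["pages"].values()))
        return page["revisions"]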

If one just wants a friendly black box to tell them something they want to hear, AI is known to do that.

CamperBob2 7 hours ago | parent | prev [-]

LLMs disclose sources now.

tux3 7 hours ago | parent [-]

Right. Try clicking those sources: half the time there is zero relation to the sentence. LLMs just output what they want to say, then sprinkle whatever the web search found onto random sentences.

And not just bottom-of-the-barrel LLMs. Ask Claude about Intel PIN tools and it will merrily tell you that it "Has thread-safe APIs but performance issues were noted with multi-threaded tools like ThreadSanitizer", then cite the Disney Pins blog and the DropoutStore "2025 Pin of the Month Bundle" as inline sources.

Enamel pins. That's the level of trust you should have when LLMs pretend to be citing a source.
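
If you want a first-pass filter for this failure mode, even crude keyword overlap between the claim and the cited page catches most of it, though note it can be fooled by exactly the kind of homonym ("pin") that confused the model in the first place:

    import re
    import requests

    STOPWORDS = {"the", "and", "was", "were", "with", "but", "has", "like"}

    def content_words(text):
        return set(re.findall(r"[a-z]{3,}", text.lower())) - STOPWORDS

    def citation_overlap(claim, url):
        # Fraction of the claim's content words found on the cited page.
        page_words = content_words(requests.get(url, timeout=10).text)
        claim_words = content_words(claim)
        return len(claim_words & page_words) / max(len(claim_words), 1)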

CamperBob2 5 hours ago | parent [-]

Did I say not to check the sources?

Or is that something you made up?

jamespo 3 hours ago | parent [-]

Ah, so irrelevant or invalid sources are OK...

CamperBob2 3 hours ago | parent [-]

Only the first couple of time derivatives matter. The models are better than they were. Are you?