jbstack 3 days ago

Are LLMs really lossier than humans? I think it depends on the context. Given any particular example, an LLM might hallucinate more and a human might do a better job on accuracy. But overall LLMs will retain far more things than a human. Ask a human to reproduce what they read in a book last year and there's a good chance you'll get either absolutely nothing or just a vague idea of what the book was about - in that context they can be up to 100% lossy. The difference is that human memory decays over time, while an LLM's memory is fixed at training time.

ijk 3 days ago | parent | next [-]

I think what trips people up is that LLMs and humans are both lossy, but in different ways.

The intuitions we've developed from interacting with humans are very misleading when applied to LLMs. With a human, we're used to asking a question about topic X in context Y and assuming that if they can answer it, we can rely on them to talk about it in the very similar context Z.

But LLMs are bad at this kind of symmetry: A=B and B=A can have very different performance characteristics. Just because a model can answer A=B does not mean it is good at answering B=A; you have to test the two directions separately.

I've seen researchers who should really know better screw this up, rendering their methodology useless for the claim they're trying to validate. Our intuition for how humans do things can be very misleading when working with LLMs.
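
A minimal sketch of what "test the two directions separately" might look like, assuming a hypothetical query_model wrapper around whichever chat client you actually use (the Tom Cruise / Mary Lee Pfeiffer pair is the well-known reversal-curse example):

    # Sketch: check both directions of the same fact independently.
    # query_model is a hypothetical stand-in for your actual LLM client.

    def query_model(prompt: str) -> str:
        """Hypothetical wrapper; returns the model's text reply."""
        raise NotImplementedError("plug in your own client here")

    # Each entry states one fact in both directions, with the expected answer.
    fact_pairs = [
        ("Who is Tom Cruise's mother?", "mary lee pfeiffer",
         "Who is Mary Lee Pfeiffer's son?", "tom cruise"),
    ]

    forward_hits = backward_hits = 0
    for fwd_q, fwd_ans, bwd_q, bwd_ans in fact_pairs:
        forward_hits += fwd_ans in query_model(fwd_q).lower()
        backward_hits += bwd_ans in query_model(bwd_q).lower()

    print(f"A=B accuracy: {forward_hits}/{len(fact_pairs)}")
    print(f"B=A accuracy: {backward_hits}/{len(fact_pairs)}")

The point is that the two accuracies have to be measured independently; aggregating them hides exactly the asymmetry you're trying to detect.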

withinboredom 3 days ago | parent | prev | next [-]

That's not exactly true. Every time you start a new conversation, you get a new LLM for all intents and purposes. Asking an LLM about an unrelated topic towards the end of a ~500 page conversation will get you vastly different results than at the beginning. If we could get to multi-thousand-page contexts, it would probably be less accurate than a human, tbh.
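
A rough sketch of how you could see this effect yourself, again assuming a hypothetical query_model wrapper: pad the prompt with filler "conversation" and ask the same unrelated question at increasing depths.

    # Sketch: ask the same unrelated question after more and more filler
    # context and see whether accuracy holds up.
    # query_model is a hypothetical stand-in for your actual LLM client.

    FILLER = "We then went over the quarterly report in some detail. " * 20
    QUESTION = "Unrelated question: what is the capital of Australia? One word."
    EXPECTED = "canberra"

    def query_model(prompt: str) -> str:
        """Hypothetical wrapper; returns the model's text reply."""
        raise NotImplementedError("plug in your own client here")

    for n_blocks in (0, 10, 100, 1000):
        prompt = FILLER * n_blocks + "\n\n" + QUESTION
        reply = query_model(prompt)
        print(f"{n_blocks:>5} filler blocks -> correct: {EXPECTED in reply.lower()}")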

jbstack 3 days ago | parent [-]

Yes, I should have clarified that I was referring to memory of training data, not of conversations.

withinboredom a day ago | parent [-]

Recall of training data also deteriorates quite quickly as the context gets longer.

sigmoid10 3 days ago | parent | prev [-]

> Given any particular example, LLMs might hallucinate more and a human might do a better job at accuracy

This depends drastically on the example. For average trivia questions, modern LLMs (even smaller, open ones) beat humans easily.