| ▲ | simonw 3 days ago |
| I think you are missing the point of the analogy: a lossy encyclopedia is obviously a bad idea, because encyclopedias are meant to be reliable places to look up facts. |
|
| ▲ | latexr 3 days ago | parent | next [-] |
| And my point is that “lossy” does not mean “unreliable”. LLMs aren’t reliable sources of facts, no argument there, but a true lossy encyclopaedia might be. Lossy algorithms don’t just make up and change information, they remove it from places where it might not make a difference to the whole. A lossy encyclopaedia might be one where, for example, you remove the images plus grammatical and phonetic information. Eventually you might compress the information to the point where the entry for “dog” only reads “four-legged creature” (which is correct but not terribly helpful), but you wouldn’t get “space mollusk”. |
| |
| ▲ | simonw 3 days ago | parent [-] | | I don't think a "true lossy encyclopedia" is a thing that has ever existed. | | |
| ▲ | latexr 3 days ago | parent | next [-] | | One could argue that’s what a pocket encyclopaedia (those exist) is. But even if we say they don’t, when you make up a term by mushing two existing words together it helps if the term makes sense. Otherwise, why even use the existing words? You called it a “lossy encyclopedia” and not a “spaghetti ice cream” for a reason, presumably so the term evokes an image or concept in the mind of the reader. If it’s bringing up a different image than what you intended, perhaps it’s not a good term. I remember you being surprised when the term “vibe coding” deviated from its original intention (I know you didn’t come up with it). But frankly I was surprised at your surprise: it was entirely predictable and obvious how the term was going to be used. The concept I’m attempting to communicate to you is that when you make up a term you have to think not only of the thing in your head but also of the image it conjures up in other people’s minds. Communication is a two-way street. | | |
| ▲ | nyeah 3 days ago | parent [-] | | I think you're saying that "pocket encyclopedia" is one definition of "lossy encyclopedia" that may occur to people (or that may get marketed on purpose). But that's a very poor definition of LLMs. And so the danger is that people may lock onto a wildly misleading definition. Am I getting the point? |
| |
| ▲ | ianburrell 3 days ago | parent | prev | next [-] | | All encyclopedias are lossy. They curate the info they include, only choosing important topics. Wikipedia is lossy. They delete whole articles for irrelevance. They edit changes to make them more concise. They require sources for facts. All good things, but Wikipedia is a subset of human knowledge. | |
| ▲ | prerok 2 days ago | parent | prev [-] | | Since sibling comments all seem to have concentrated on idealistic good intent, I would also like to point out a different side of things. I grew up under socialism. Since we've transitioned to democracy, I learned that I have to unlearn some things. Our encyclopedias were not inaccurate but were not complete. It's like lying through omission. And as the old saying goes, half-truths are worse than lies. Whether this would be deemed a lossy encyclopedia, I don't know. What I am certain of, however, is that it was accurate but omitted important additional facts. And that is what I see in LLMs as well. Overall, it's accurate, except in cases where an additional fact would alter the conclusion. So, it either could not find arguments with that fact, or it chose to ignore them to give an answer and could be prompted into taking them into account or whatever. What I do know is that LLMs of today give me the same heebie-jeebies that rereading those encyclopedias of my youth gives me. |
|
|
|
| ▲ | baq 3 days ago | parent | prev | next [-] |
| A lossy encyclopedia which you can talk to and it can look up facts in the lossless version while having a conversation OTOH is... not a bad idea at all, and hundreds of millions of people agree if traffic numbers are to be believed. (but it isn't and won't ever be an oracle and apparently that's a challenge for human psychology.) |
| |
| ▲ | simonw 3 days ago | parent [-] | | Completely agree with you - LLMs with access to search tools that know how to use them (o3, GPT-5, Claude 4 are particularly good at this) mostly paper over the problems caused by a lossy set of knowledge in the model weights themselves. But... end users need to understand this in order to use it effectively. They need to know if the LLM system they are talking to has access to a credible search engine and is good at distinguishing reliable sources from junk. That's advanced knowledge at the moment! | | |
| ▲ | johnecheck 3 days ago | parent | next [-] | | From earlier today: Me: How do I change the language settings on YouTube? Claude: Scroll to the bottom of the page and click the language button on the footer. Me: YouTube pages scroll infinitely. Claude: Sorry! Just click on the footer without scrolling, or navigate to a page where you can scroll to the bottom, like a video. (Video pages also scroll indefinitely through comments.) Me: There is no footer, you're just making shit up. Claude: [finally uses a search engine to find the right answer] | | |
| ▲ | pbhjpbhj 3 days ago | parent [-] | | IME, eventually, after a long time, the scrolling stops and you can get to the footer. YMMV! |
| |
| ▲ | gf000 3 days ago | parent | prev [-] | | Slightly off topic, but my experience is that they are pretty terrible at using search tools. They can often reason themselves into some very stupid direction, burning all the tokens for no reason and failing to reply in the end. |
|
|
|
| ▲ | checkyoursudo 3 days ago | parent | prev | next [-] |
| I am sympathetic to your analogy. I think it works well enough. But it falls a bit short in that encyclopedias, lossy or not, shouldn't affirmatively contain false information. The way I would picture a lossy encyclopedia is that it can misdirect by omission, but it would not change A to ¬A. Maybe a truthy-roulette encyclopedia? |
| |
| ▲ | tomrod 3 days ago | parent [-] | | I guarantee every encyclopedia has mistakes. | | |
| ▲ | Jensson 2 days ago | parent [-] | | I remember a study where they checked whether Wikipedia had more errors than paper encyclopedias, and they found there were about as many errors in both. That study ended the "you can't trust Wikipedia" argument: you can't fully trust anything, but Wikipedia is about as good as a second-hand reference gets. |
|
|
|
| ▲ | butlike 3 days ago | parent | prev | next [-] |
| I don't like the confident hallucinations of LLMs either, but don't they rewrite and add entries in the encyclopedia every few years? Implicitly, that makes your old copy "lossy". Again, I'd never really want a confidently wrong encyclopedia, though. |
|
| ▲ | rynn 3 days ago | parent | prev [-] |
| Aren't all encyclopedias 'lossy'? They are all partial collections of information; none have all of the facts. |
| |
| ▲ | prerok 2 days ago | parent [-] | | There's an important difference as to what is omitted. An encyclopedia could say "general relativity is how the universe works" or it could say "general relativity and quantum mechanics describe how we understand the universe today and scientists are still searching for a universal theory". Both are short, but the first statement is omitting important facts. Lossy in the sense of not explaining details is ok, but omitting swathes of information would be wrong. |
|