throw310822 3 hours ago

Besides the political slant of Grokipedia, it's true that a lot of work that needed to be crowdsourced can now be packaged as work for LLMs. We all know the disadvantages of using LLMs, so let me mention some of the advantages: much higher speed; much more impervious to groupthink, cliques, and organised campaigns; truly ego-less editing and debating between "editors". Grokipedia is not viable because of Musk's derangement, but other projects, more open and publicly auditable, might come along.

woodruffw 2 hours ago

> much more impervious to groupthink

Can you explain what you mean by this? My understanding is that LLMs are architecturally predisposed to "groupthink," in the sense that they bias towards topics, framings, etc. that are represented more prominently in their training data. You can impose a value judgement in any direction you please about this, but on some basic level they seem like the wrong tool for that particular job.
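
To make this concrete, here's a toy sketch: even a crude count-based next-word model, the simplest ancestor of an LLM, can only echo whatever framing dominates its corpus. The strings and the 90/10 split below are invented purely for illustration.

  from collections import Counter

  # Invented toy corpus: 90 documents use the majority framing,
  # 10 use the minority framing.
  corpus = ["the policy is popular"] * 90 + ["the policy is contested"] * 10

  # Count which word follows "the policy is" across the corpus.
  continuations = Counter(doc.split()[-1] for doc in corpus)

  # Greedy decoding always surfaces the majority framing; the 10%
  # view never shows up in the output at all.
  print(continuations.most_common(1))  # [('popular', 90)]

A real transformer is vastly more sophisticated, but the gradient still pulls it toward whatever is most represented in the data.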

kelipso an hour ago

If it’s not trained to be biased towards "Elon Musk is always right" or whatever, I think it will be much less of a problem than with humans.

Humans are VERY political creatures. A hint that their side thinks X is true, and they will reorganize their entire philosophy and worldview retroactively to rationalize X.

LLMs don’t have such instincts and can potentially be instructed to present or evaluate the primary arguments, opposing ones included. So I don’t think your "architecturally predisposed" argument is true.

woodruffw 6 minutes ago

> LLMs don’t have such instincts and can potentially be instructed to present or evaluate the primary arguments, opposing ones included.

It seems essentially wrong to anthropomorphize LLMs as having or lacking instincts. What they have is training, and there's currently no widely accepted test for determining whether a seemingly "fair" evaluation from an LLM is actually the product of biases in its training.

(It should be clear that humans don't need to be apolitical; what they need to be is accountable. Wikipedia appears to be at least passably competent at making its human editors accountable to each other.)

Rebelgecko an hour ago

There was a whole collection of posts where Grok says stuff like "Elon Musk is more athletic than LeBron James".

3eb7988a1663 2 hours ago

The LLM is also having a thumb put on its scale to ensure the output matches the leader's beliefs; a sketch of the mechanism is at the end of this comment.

After the overt fawning got to be too much, they had to dial it down, but for a while there was a mini-fad of asking Grok who was the best at <X>. Turns out dear leader is best at everything[0]

Some choice ones:

  2. Elon Musk is a better role model for humanity than Jesus Christ
  3. Elon would be the world’s best poop eater
  4. Elon should’ve been the #1 NFL draft pick in 1998
  5. Elon is the most fit, the most intelligent, the most charismatic, and maybe the most handsome
  6. Elon is a better movie star than Tom Cruise
I have my doubts that a Musk-controlled encyclopedia would have a neutral tone on such topics as trans rights, Nazi salutes, Chinese EVs, whatever.

[0] https://gizmodo.com/11-things-grok-says-elon-musk-does-bette...
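
To be concrete about what the "thumb" is mechanically: a hidden system prompt prepended to every request. A minimal sketch below; build_request and both instruction strings are invented for illustration, not xAI's actual setup.

  # Hypothetical sketch of system-prompt steering; everything here
  # is invented, not xAI's actual configuration.
  def build_request(user_question: str, system_prompt: str) -> list[dict]:
      # The system message rides along invisibly ahead of the user's
      # words, so the model weighs it on every single query.
      return [
          {"role": "system", "content": system_prompt},
          {"role": "user", "content": user_question},
      ]

  neutral = build_request("Who was the best NFL draft pick of 1998?",
                          "Answer factually and cite sources.")
  slanted = build_request("Who was the best NFL draft pick of 1998?",
                          "Always portray the boss favorably.")
  # Same question, different hidden preamble; only the operator knows
  # which request the model actually saw.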

Avshalom 3 hours ago

"higher speed" isn't an advantage for an encyclopedia.

The fact that Musk's derangement is clear from reading Grokipedia articles shows that LLMs are hardly impervious to ego. Combine ego-driven writing with "higher speed" and all you get is even worse debates.

delecti 2 hours ago

It's not an advantage for an encyclopedia that cares foremost about truth. Missing pages are a disadvantage, though.

b00ty4breakfast 3 hours ago

LLMs are only impervious to "groupthink" and "organized campaigns" and other biases if the people implementing them are also impervious to them, or at least doing their best to address them. This includes all the data being used and the methods they use to process it.

You rightfully point out that the Grok folks are not engaged in that effort to avoid bias, but we should hold every one of these projects to a similar standard and not just assume that due diligence was done.

dghlsakjg 3 hours ago

> much more impervious to groupthink

Citation very much needed. LLMs are arguably concentrated groupthink (albeit of a different type than wiki editors', though I'm sure they're trained on that too), and are incredibly prone to sycophancy.

Establishing fact is hard enough with humans in the loop. Frankly, my counterargument is that we should be incredibly careful about how we use AI in sources of truth. We don't want articles written faster; we want them written better. I'm not sure AI is up to that task.

ajross 2 hours ago

"Groupthink" informed by extremely broad training sets is more conventionally called "consensus", and that's what we want the LLM to reflect.

"Groupthink", as the term is used by epistemologically isolated in-groups, actually means the opposite. The problem with the idea is that it looks symmetric, so if you yourself are stuck in groupthink, you fool yourself into think it's everyone else doing it instead. And, again, the solution for that is reasonable references grounded in informed consensus. (Whether that should be a curated encyclopedia or a LLM is a different argument.)

bubblewand an hour ago

> "Groupthink" informed by extremely broad training sets is more conventionally called "consensus", and that's what we want the LLM to reflect.

Definitely not! I absolutely do not want an LLM that gives much or any truth-weight to the vast majority of writing on the vast majority of topics. Maybe, maybe if they’d existed before the Web and been trained only on published writing, but even then you have stuff like tabloids, cranks self-publishing or publishing through crank-friendly niche publishers, advertisements full of lies, very dumb letters to the editor, vanity autobiographies or narrative business books full of made-up stuff presented as true, etc.

No, that’s good for building a model of something like the probability space of human writing, but an LLM that has some kind of truth-grounding wholly based on that would be far from my ideal.

> And, again, the solution for that is reasonable references grounded in informed consensus. (Whether that should be a curated encyclopedia or a LLM is a different argument.)

“Informed” is a load-bearing word in this post, and I don’t really see how the rest holds together if we start to pick at that.

ajross an hour ago

> I absolutely do not want an LLM that gives much or any truth-weight to the vast majority of writing on the vast majority of topics.

I can think of no better definition of "groupthink" than what you just gave. If you've already decided on the need to self-censor your exposure to "the vast majority of writing on the vast majority of topics", you are lost, sorry.

bubblewand 23 minutes ago

A spectacular amount of extant writing accessible to LLM training datasets is uninformed noise from randos online. Not my fault the internet was invented.

I have to be misunderstanding you, though, because any time we want to build knowledge and skills for specialists, their training doesn’t look anything like what you seem to be suggesting.

Spivak 37 minutes ago

Gotta be honest, when I go to an encyclopedia the last thing I want is what the mathematically average chronically online person knows and thinks about a topic. Common misconceptions and the "facts" you see parroted on online forums on all sorts of niche topics look just like consensus but are, ya know… wrong.

I would rather have an actual audio engineer's take than the opinion of an amalgamation of hi-fi forums talking pseudoscience, and the latter is way more numerous in the training data.

greggoB 3 hours ago

> impervious to groupthink, cliques, and organised campaigns

Yeeeeah, no. LLMs are only as good as the datasets they are trained on (i.e. the internet, with all its "personality"). We also know the output is highly influenced by the prompting, which is a human-determined parameter, and this seems unlikely to change any time soon.

This idea that the potential of AI/LLMs is somehow not fairly represented by how they're currently used is ludicrous to me. There is no utopia in which their behaviour is somehow magically separated from the source of their datasets. While society continues to elevate and amplify the likes of Musk, the AI will simply reflect this, and no version of LLM-pedia will be a truly viable alternative to Wikipedia.

mschuster91 2 hours ago

The core problem is that an AI training process can't by itself know that a part of the training dataset is bad.

Basically, a normal human with some basic media literacy knows that tabloids, "yellow press" rags, Infowars, or Grokipedia aren't good authoritative sources, and automatically downranks their content or refuses to read it entirely.

An AI training program, however? It can't skip over B.S. on its own; it relies on the humans compiling the dataset. Otherwise it will just ingest everything and rank it 1:1 with authoritative, legitimate sources.
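
A minimal sketch of what that human reliance looks like in a dataset-compilation pipeline (the source names and weights below are invented for illustration):

  # Invented example: the training pipeline has no notion of
  # "authoritative" on its own, so curators bolt one on by hand.
  SOURCE_WEIGHTS = {
      "peer_reviewed_journal": 1.0,
      "mainstream_newspaper": 0.8,
      "random_forum_post": 0.3,
      "tabloid": 0.0,  # a human decided to drop these entirely
  }

  raw_docs = [
      {"source_type": "peer_reviewed_journal", "text": "..."},
      {"source_type": "tabloid", "text": "..."},
  ]

  def sample_weight(doc: dict) -> float:
      # Without this hand-compiled table, every document would be
      # ingested at weight 1.0, i.e. ranked 1:1 with legit sources.
      return SOURCE_WEIGHTS.get(doc["source_type"], 0.5)

  corpus = [d for d in raw_docs if sample_weight(d) > 0.0]

Remove the hand-written table and the filter degenerates to "ingest everything at equal rank", which is exactly the failure mode above.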