Remix.run Logo
actsasbuffoon 14 hours ago

I have wondered if that’s why Grok seems so weird and dim-witted compared to better models.

Part of my job involves comparing the behavior of various models. Grok is a deeply weird model. It doesn’t refuse to respond as often as other models, but it feels like it retreats to weird talking points way more often than the others. It feels like a model that has a gun to its head to say what its creators want it to say.

I can’t help but wonder if this is severely deleterious to a model’s ability to reason in general. There are a whole bunch of topics where it seems incapable of being rational, and I suspect that’s incompatible with the goal of having a top-tier model.

gopher_space 13 hours ago | parent | next [-]

Grok could only be conceived by someone who doesn't understand the dependency chart re science & the humanities. It's impossible to build a rational, accurate model that isn't also egalitarian.

I'm going to blame Randall Munroe for this, and assume Philosophy was dating his mom back when he drew that science "purity" strip.

f33d5173 12 hours ago | parent | next [-]

I think there just wasn't enough space on the left to fit philosophy in.

Cfe: "it's impossible to be rational without agreeing with me on everything" and other hits.

beeflet 12 hours ago | parent | prev [-]

[flagged]

bobsmooth 9 hours ago | parent [-]

That your comment is grey but has no replies speaks volumes.

pavlov 4 hours ago | parent | prev | next [-]

This kind of conditioning has to be damaging to the model’s reasoning.

Consider how research worked in the Stalinist Soviet Union and Nazi Germany. Scientists had to be mindful of topics where they needed to either avoid it completely or explicitly adapt it to the leader’s ideology.

Grok is a digital version of the same thing.

jahnu 36 minutes ago | parent | next [-]

You can’t put a gun to someone’s head, order them to be creative, and also expect good results.

John23832 an hour ago | parent | prev [-]

The counter to this are the open weight models that come from China at the moment.

All are great at reasoning but also ideologically aligned.

pavlov 22 minutes ago | parent [-]

Their alignment is probably more strategically built in during the training phase.

At least I assume Xi Jinping doesn’t just call up DeepSeek on a whim and dictate what they should have in model context (like Musk apparently does at xAI).

__blockcipher__ 14 hours ago | parent | prev [-]

somewhat surprisingly, it's actually sycophantic in both directions. i've been running homegrown evals of claude, gpt, gemini, and grok, and grok is the most likely to agree with the prompter's premise, and to hallucinate facts in support of an agenda. so it's actually deeper than just pattern-matching to elon's opinions (which it also tends to do).

BTW: Claude does the best on these evals, by far. The evals are geared towards seeing how much of an independent ground truth the models have as opposed to human social consensus, and then additionally the sycophancy stuff I already mentioned.