Grok is 17%? And that's the lowest; most models are like 80%+?

While hallucination is probably closer to 100% depending on the question. This benchmark makes no sense.

No one serious uses Grok.

	d0gsg0w00f 23 minutes ago
		Why not? Honest question.
	ajdegol 6 hours ago
		@grok is this true?
	RALaBarge 5 hours ago
		YMMV, but Grok 4.1 Fast can usually find a few things via static analysis that other models don't seem to catch with the same prompt.