dubcanada 7 hours ago
Grok is 17%? And that's the lowest, most models are like 80%+? While hallucination is probably closer to 100% depending on the question. This benchmark makes no sense.
elAhmo 6 hours ago | parent
No one serious uses Grok.