Alifatisk 5 hours ago

In my experience, Grok 4 expert performs way worse than what the benchmarks say.

I’ve tried it with coding, writing and instruction following. The only thing it currently excels at is searching for things across the web and Twitter.

Otherwise, I would never use it for anything else. At coding, it always includes an error, and when it patches that one, it introduces another. When it writes creative text and has to follow instructions, it hallucinates a lot.

Based on my experience, I suspect xAI of bench-maxing on Artificial Analysis, because there is no way Grok 4 expert performs close to GPT-5.2, Claude Sonnet 4.5 and Gemini 3 Pro.