cherioo 5 hours ago

What I find interesting is how Anthropic has improved more consistently over the last few years, which has allowed it to catch up with and surpass OpenAI and Google. The latter two have pretty much plateaued over the last year or so. GPT 5.5 is somehow not moving the needle at all.

I hope the other labs can bring back competition soon!

XCSme 5 hours ago | parent [-]

GPT 5.5 is quite a big leap; it's a lot better than Opus 4.7 for agentic coding.

energy123 5 hours ago | parent | next [-]

Arena only allows very small context sizes, so it's a noisy benchmark for what we care about IRL.

mettamage 3 hours ago | parent | prev [-]

Better in what ways? I'm just curious about your experience.

XCSme 3 hours ago | parent [-]

Consistency, not making mistakes.

mettamage 3 hours ago | parent [-]

Ahh... that is indeed an issue I have with Claude. I'll check it out!