alecco 12 hours ago
For SWE the ranking is the same. But if Google's $20/mo plan is comparable to the $100-200/mo plans from OpenAI and Anthropic, then yes, they are done. We'll have to wait a few weeks to see whether the nerfed post-release model is still as good.
siva7 11 hours ago
I have a few secret prompts to test the complex reasoning capabilities of new models (in law and medicine). On my own benchmark, Gemini (2.5 Pro) is behind Anthropic (Sonnet 4.5 basic thinking) and OpenAI (pro model) by a wide margin, and I trust my own benchmark more than public leaderboards. So it's the other way around: Google is trying to catch up to where the others are. It just doesn't seem that way to some because Google undercuts on price, and most people don't have their own complex problems with a verified solution to test against (which would let them see how bad Gemini really is).