surround 5 days ago
> The betting markets were not impressed by GPT-5. I am reading this graph as "there is a high expectation that Google will announce Gemini-3 in August", and not as "Gemini 2.5 is better than GPT-5".

This is an incorrect interpretation. The benchmark which the betting market is based upon currently ranks Gemini 2.5 higher than GPT-5.
theahura 5 days ago
EDIT: I updated the article to account for this perspective.

------

This can't be right -- they're using LMArena without style control to resolve the market, and GPT-5 is ahead right? (https://lmarena.ai/leaderboard/text/overall-no-style-control)

> This market will resolve according to the company which owns the model which has the highest arena score based off the Chatbot Arena LLM Leaderboard (https://lmarena.ai/) when the table under the "Leaderboard" tab is checked on August 31, 2025, 12:00 PM ET.

> Results from the "Arena Score" section on the Leaderboard tab of https://lmarena.ai/leaderboard/text with the style control off will be used to resolve this market.

> If two models are tied for the top arena score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order (e.g. if both were tied, "Google" would resolve to "Yes", and "xAI" would resolve to "No")

> The resolution source for this market is the Chatbot Arena LLM Leaderboard found at https://lmarena.ai/. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
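For anyone who wants the quoted resolution rule spelled out, here is a rough Python sketch of how I read it: take the highest arena score, and if several companies are tied at the top, pick the name that sorts first alphabetically. The `Entry` type, the `resolve_market` function, the model labels, and the scores are all made up for illustration, and the case-insensitive ordering is my assumption, since the rules only say "alphabetical order".

```python
from typing import NamedTuple


class Entry(NamedTuple):
    company: str  # company name as listed in the market group
    model: str    # hypothetical model label
    score: int    # placeholder arena score, not a real leaderboard value


def resolve_market(entries: list[Entry]) -> str:
    """Return the company whose model has the top arena score.

    Ties at the top go to the company name that sorts first alphabetically.
    """
    if not entries:
        # The quoted rules keep the market open in this case; here we just fail.
        raise ValueError("leaderboard is empty or unavailable")
    top = max(e.score for e in entries)
    tied = {e.company for e in entries if e.score == top}
    # Assumption: compare names case-insensitively.
    return min(tied, key=str.casefold)


# Placeholder rows for illustration only; these are not real scores.
board = [
    Entry("Google", "model-a", 1500),
    Entry("xAI", "model-b", 1500),
    Entry("OpenAI", "model-c", 1490),
]
winner = resolve_market(board)
```

With these placeholder rows, the two top entries are tied, so the tie-break decides the outcome, matching the "Google" versus "xAI" example in the rules.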
JimDabell 5 days ago
> This is an incorrect interpretation. The benchmark which the betting market is based upon currently ranks Gemini 2.5 higher than GPT-5.

You can see from the graph that Google shot way up from ~25% to ~80% upon the release of GPT-5. Google’s model didn’t suddenly get way better at any benchmarks, did it?