qnleigh 5 days ago

This model is breaking records on my benchmark of choice, which is 'the fraction of Hacker News comments that are positive.' Even people who avoid Google products on principle are impressed. Hardly anyone is arguing that ChatGPT is better in any respect (except brand recognition).

ipsum2 5 days ago | parent | next [-]

ChatGPT 5.2 Thinking gives significantly better quality for most knowledge work, but at the cost of speed.

energy123 5 days ago | parent | next [-]

That has been my experience, primarily because it is allowed to spend far more test-time tokens than Gemini 3.0 Pro on the same prompt.

eli 5 days ago | parent | prev [-]

And GPT costs 4x as much

Palmik 4 days ago | parent | prev | next [-]

No offense, but that seems like a poor benchmark. These initial vibe checks are easily swayed by personal brand biases.

awestroke 4 days ago | parent | next [-]

The brand bias is heavily against Google, not in Google's favor.

Palmik 4 days ago | parent [-]

In the context of AI, I'm mostly seeing anti-OpenAI, pro-Google bias.

clarkmoreno 4 days ago | parent [-]

Facts. These HN threads are half astroturfing and paid shills. It's near impossible to decipher which takes are authentic unless they come from actual colleagues or people IRL.

qnleigh 4 days ago | parent | prev [-]

Fair. No benchmark is perfect.

I do pay special attention to what the most negative comments say (which in this case are unusually positive), and to people discussing performance on their own personal benchmarks.

Simon321 4 days ago | parent | prev [-]

I don't know, ChatGPT seems to hallucinate a lot less.