TechRemarker 4 hours ago

Heard all the news how Gemini 3 is passing everyone on benchmarks, so quickly tested and still find it a far cry from ChatGPT in real world use when testing questions on both platforms. But importantly the ChatGPT app experience at least for iPhone/Mac users is drastically superior vs Google which feels very Google still. So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini. But glad to see competition since certainly don't want only one winner in this race.

hodgehog11 3 hours ago | parent | next [-]

That's really fascinating. Every real world use case I've tried on Gemini (especially math-related) absolutely slaughtered the performance of ChatGPT in speed and quality, not even close. As an Android user, the Gemini app is also far superior, since the ChatGPT app still doesn't properly display math equations, among plenty of other bugs.

dudeinhawaii 2 hours ago | parent | next [-]

I have to agree with you but I'll remain a skeptic until the preview tag is dropped. I found Gemini 2.5 Pro to be AMAZING during preview and then its performance and quality unceremoniously dropped month after month once it went live. Optimizations in favor of speed/costs no doubt but it soured me on jumping ship during preview.

Anthropic pulled something similar with 3.6 initially, with a preview that had massive token output and then a real release with barely half -- which significantly curtails certain use cases.

That said, to date, Gemini has outperformed GPT-5 and GPT-5.1 on any task I've thrown at them together. Too bad Gemini CLI is still barely useful and prone to the same infinite loop issues that have plagued it for over a year.

I think Google has genuinely released a preview of a model that leapfrogs all other models. I want to see if that is what actually makes it to production before I change anything major in my workflows.

verdverm 3 hours ago | parent | prev | next [-]

It's generally anecdotal and vibes when people make claims that some AI is better than another for things they do. There are too many variables and not enough eval for any of it to hold water imo. Personal preferences, experience, brand loyalty, and bias at play too

it's contemporary vim vs emacs at this point

hodgehog11 an hour ago | parent [-]

I get what you're saying because this is typically true (this is a strong motivator for my current research) but I don't think it applies here and OpenAI seems to agree with me. Some cases are clear: GPT-5 is clearly better than Llama 3 for example. If there is a sizeable enough difference across virtually all evals, it is typically clear that one LLM is a stronger performer than another.

Experiences aside, Gemini 3 beats GPT-5 on enough evals that it seems fair to say that it is a better model. This appears in line with public consensus, with a few exceptions. Those exceptions seem to be centered around search.

bdhtu 3 hours ago | parent | prev | next [-]

What do you mean? It renders LaTeX fine on Android.

hodgehog11 3 hours ago | parent | next [-]

Some LaTeX, but not all, especially for larger equations. I will admit it has gotten a lot better in recent updates, since it seemed thoroughly broken for quite a while in its early days.

null_deref 3 hours ago | parent | prev [-]

I had a problem where ChatGPT rendered math to me from right to left. Sure thing YMMV

croes 3 hours ago | parent | prev | next [-]

One might think that benchmarks do not say much about individual usage and that an objective assessment of the performance of AIs is difficult.

At least, thanks to the hype, RAM and SSDs are becoming more expensive, which eats up all the savings from using AI and the profits from increased productivity /s?

kristofferR 3 hours ago | parent | prev [-]

Try doing some more casual requests.

When I asked both ChatGPT 5.1 Extended Thinking and Gemini 3 Pro Preview High for best daily casual socks both responses were okay and had a lot of the same options, but while the ChatGPT response included pictures, specs scraped from the product pages and working links, the Gemini response had no links. After asking for links, Gemini gave me ONLY dead links.

That is a recurring experience, Gemini seems to be supremely lazy to its own detriment quite often.

A minute ago I asked for best CR2032 deal for Aqara sensors in Norway, and Gemini recommended the long discontinued IKEA option, because it didn't bother to check for updated information. ChatGPT on the other hand actually checked prices and stock status for all the options it gave me.

BeetleB 2 hours ago | parent | prev | next [-]

> But importantly the ChatGPT app experience at least for iPhone/Mac users is drastically superior vs Google which feels very Google still. So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini.

Yes, the ChatGPT experience is much better. No, Gemini doesn't need to make a better product to take market share.

I've never had the ChatGPT app. But my Android phone has the Gemini app. For free, I can do a lot with it. Granted, on my PC I do a lot more with all the models via paid API access - but on the phone the Gemini app is fine enough. I have nothing to gain by installing the ChatGPT app, even if it is objectively superior. Who wants to create another account?

And that'll be the case for most Android users. As a general hint: If someone uses ChatGPT but has no idea about gpt-4o vs gpt-5 vs gpt-5.1 etc, they'll do just fine with the Gemini app.

Now the Gemini app actually sucks in so many ways (it doesn't seem to save my chats). Google will fix all these issues, but can overtake ChatGPT even if they remain an inferior product.

It's Slack vs Teams all over again. Teams won by a large margin. And Teams still sucks!

karmasimida 2 hours ago | parent | prev | next [-]

Well I have been using Gemini and ChatGPT side by side for over 6 months now.

My experience is Gemini has significantly improved its UX and performs better on queries that require niche knowledge, think of some ancient gadgets that have been out of production for 4-5 decades. Gemini can produce reliable manuals, but ChatGPT hallucinates.

UX wise ChatGPT is still superior and for common queries it is still my go-to. But for hard queries, I am team Gemini and it hasn’t failed me once

binarymax 3 hours ago | parent | prev | next [-]

Benchmaxxing galore by lots of teams in this space.

emp17344 3 hours ago | parent [-]

I think it’s entirely possible that AI actually has plateaued, or has reached a point where a jump in intelligence comes at the cost of reliability.

hugh-avherald 3 hours ago | parent | next [-]

I suspect it's reached the point where the distinguishing quality of one model over the others is only observable by true experts -- and only in their respective fields. We are exhausting the well of frontier questions that can be programmatically asked and the answers checked.

hodgehog11 3 hours ago | parent [-]

Absolutely this. Strong disagree that progress is plateauing; it's merely that gains are harder for the general public to perceive and typically come from more advanced means than simply scaling. Math performance in particular is improving at an uncomfortably rapid pace.

lukan 2 hours ago | parent | prev [-]

AI in general? Not at all. LLMs maybe a little bit, given that even Sam Altman said the progress is logarithmic to the investment. Still, there is progress. And the potential of LLM-based agents, where many different models and other techniques are mixed together, is something we have only just started to explore.

pohl 2 hours ago | parent | prev | next [-]

I had a similar experience, signing up for the first time to give Gemini a test drive on my side project after a long time using ChatGPT. The latter has a native macOS client which "just works" integrating with Xcode buffers. I couldn't figure out how to integrate Gemini with Xcode quickly enough so I'm resorting to pasting back & forth from the browser. A few of the exchanges I've had "felt smarter" — but, on the whole, it feels like maybe it wasn't as well trained on Swift/SwiftUI as the OpenAI model. I haven't decided one way or another yet, but those are my initial impressions.

doug_durham 2 hours ago | parent | prev | next [-]

I've been a paying high volume user of ChatGPT for a while. I found the transition to Gemini to be seamless. I've been pleasantly surprised. I bounce between the two. I'm at about 60% Gemini, 40% ChatGPT.

kranke155 3 hours ago | parent | prev | next [-]

Gemini comes with the 1.99 Google One plan. So I use that

BeetleB 2 hours ago | parent [-]

Actually, it comes with the free plan. The $1.99 plan doesn't give you any more AI capabilities. Only at the $19.99/mo plan do you get more.

https://one.google.com/about/#compare-plans

xnx 3 hours ago | parent | prev | next [-]

> So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini.

or cheaper/free

lanthissa 3 hours ago | parent | prev | next [-]

they're deep into a redesign of the gemini app, idk when it will be released or if it's going to be good, but at least they agree with you and are putting significant resources into fixing it.

tmaly 22 minutes ago | parent [-]

I did notice a bug on the iPhone, even with app background refresh, if the phone shuts off the screen, a prompt that was processing stalls out.

tapoxi 4 hours ago | parent | prev | next [-]

It's really hard to measure these things. Personally I switched to Gemini a few months ago since it was half the cost of ChatGPT (Verizon has a $10/month Google AI package). I feel like I've subconsciously learned to prompt it slightly differently and now using OpenAI products feels disappointing. Gemini tends to give me the answer I expect, Claude follows close behind, I get "meh" results from OpenAI.

I am using Gemini 3 Pro, I rarely use Flash.

golfer 3 hours ago | parent | prev | next [-]

I couldn't even get ChatGPT to let me download code it claimed to program for me. It kept saying the files were ready but refused to let me access or download anything. It was the most basic use case and it totally bombed. I gave up on ChatGPT right then and there.

It's amazing how different people have wildly varying experiences with the same product.

embedding-shape 3 hours ago | parent | next [-]

It's because comparing their "ChatGPT" experience with your "ChatGPT" experience doesn't tell anyone anything. Unless people start saying what models and prompts they're using, the discussions back and forth about what platform is the best provide zero information to anyone.

dudeinhawaii 3 hours ago | parent | prev | next [-]

Did you wait a while before downloading? The links it provides for temporary projects have a surprisingly brief window where you can download them. I've had a similar experience even when waiting just 1 minute to download the file.

_whiteCaps_ 2 hours ago | parent | prev | next [-]

The same thing happens to me in Claude occasionally. I have to tell it "Generate a tar.gz archive for me to download".

bdbdbdb 3 hours ago | parent | prev [-]

Since LLMs are non-deterministic it's not that amazing. You could ask it the same question as me and we could both get very different conversations and experiences

jiggawatts 4 hours ago | parent | prev | next [-]

Curiously, I had the opposite experience, except for Deep Research mode where after the latest update the OpenAI offering has become genuinely amazing. This is doubly ironic because Gemini has direct API access to Google search!

threecheese 3 hours ago | parent | next [-]

It is good, but Pro subscribers get only five per month. After that, it’s a limited version, and it’s not good (normal 5.1 gives more comprehensive answers than DR Limited).

observationist 3 hours ago | parent | prev [-]

Google search is awful. I don't think they can put lipstick on that particular pig and expect anyone to think it's beautiful.

coppsilgold 2 hours ago | parent [-]

I'm sure they give their AI models a superior search than they give to us.

Also if you prompt Google search the right way it's unfortunately still superior to most if not all other solutions in most cases.

par 3 hours ago | parent | prev | next [-]

Yeah, hate to say but for me a big thing is i still couldn't separate my Gemini chats into folders. I had ChatGPT export some profiles and history and moved it into Gemini, and 1) when Gemini gave me answers i was more pleased but 2) Gemini was a bit more rigorous on guard rails, which seems a bit overly cautious. I was asking some pretty basic non-controversial stuff.

machomaster 15 minutes ago | parent | next [-]

If I research anything close to controversial, I use Grok. Its no-censorship attitude is great.

bitpush 2 hours ago | parent | prev [-]

Looks like it is coming.

https://www.androidauthority.com/google-gemini-projects-2-36...

r_lee 3 hours ago | parent | prev | next [-]

I'm confused as well, it hallucinated like crazy

like it seems great, but then it's just bullshitting about what it can do or whatever

potsandpans 2 hours ago | parent | prev | next [-]

What are your primary usecases? Are you mostly using it as a chatbot?

I find gemini excels in multimodal areas over chatgpt and anthropic. For example, "identify and classify this image with meta data" or "ocr this document and output a similar structure in markdown"

j45 3 hours ago | parent | prev | next [-]

Training and gaming for the benchmarks is different than actual use.

mrcwinn 2 hours ago | parent | prev [-]

This is exactly my experience. And it's funny -- this crowd is so skeptical of OpenAI... so they prefer _Google_ to not be evil? It's funny how heroes and villains are being re-cast.