hsaliak 4 hours ago

Google's Pro service (no idea about Ultra, and I have no intention to find out) is riddled with 429s. They have generous quotas for sure, but they really give you very low priority. For example, I still don't have access to Gemini 3.1 from that endpoint. It's completely uncharacteristic of Google.

I analyzed 6k HTTP requests on the Pro account; 23% of them were hit with 429s. (Though not from Gemini-CLI, but from my own agent using Code Assist.) The gemini-cli has a default retry backoff of 5s. That's verifiable in the code, and it's a lot.
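For readers unfamiliar with how a retry loop like that behaves, here's a minimal sketch of a 429 retry with exponential backoff starting at the 5s default mentioned above. This is an illustration only, not gemini-cli's actual code; the function and parameter names are made up:

```python
import time

def retry_with_backoff(request_fn, max_retries=5, base_delay=5.0, sleep=time.sleep):
    """Retry request_fn while it returns HTTP 429, doubling the delay each time.

    base_delay=5.0 mirrors the 5s default mentioned above; everything else
    here is a hypothetical sketch, not gemini-cli's implementation.
    """
    delay = base_delay
    for attempt in range(max_retries + 1):
        status, body = request_fn()
        if status != 429:
            return status, body        # success (or a non-retryable error)
        if attempt == max_retries:
            break                      # out of retries, give up
        sleep(delay)                   # wait before retrying
        delay *= 2                     # exponential backoff: 5s, 10s, 20s, ...
    return status, body

# Usage: a fake endpoint that returns 429 twice, then succeeds.
# Injecting `sleep=waits.append` records the delays instead of sleeping.
responses = iter([(429, ""), (429, ""), (200, "ok")])
waits = []
status, body = retry_with_backoff(lambda: next(responses), sleep=waits.append)
# status == 200, waits == [5.0, 10.0]
```

With a 23% 429 rate, even two retries at 5s and 10s means a single logical request can stall for 15+ seconds, which is why the default feels so punishing.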

I don't touch the Antigravity endpoint; unlike Code Assist, it's clear that they are subsidizing it for user acquisition on that tool. So perhaps it's OK for them to ban users from it.

I like their models, but they also degrade. It's quite easy to see when the models are 'smart' and capacity is available, and when they are 'stupid'. They likely clamp thinking when they are capacity strapped.

Yes, the models are smart, but you really can't "build things", despite the marketing, if you actively beat back your users for trying. I spent a decade at Google, and it's sad to see how they are executing here, despite having solid models in gemini-3-flash and gemini-3.1.

gck1 3 hours ago | parent | next [-]

> Yes, the models are smart, but you really can't "build things", despite the marketing, if you actively beat back your users for trying

I think this is the most important takeaway from this thread, and at some point it will come back to bite Google and Anthropic.

OpenAI seems to have realized this and is actively trying to do the opposite. They welcomed OpenCode the same day Anthropic banned it, X is full of tweets of people saying the Codex $20 plan is more generous than Anthropic's $200 plan, etc.

If you told me this story a year ago without naming companies, I would tell you it's OpenAI banning people and Google burning cash to win the race.

And it's not like their models are winning any awards in the community either.

lukeschlather 3 hours ago | parent | next [-]

My impression is there's a definite shortage of GPUs, and if OpenAI is more reliable it's because they have fewer customers relative to the number of GPUs they have. I don't think Google is handing out 429s because they are worried about overspending; I think it's because they literally cannot serve the requests.

gck1 2 hours ago | parent [-]

This sounds very plausible. OpenAI has hoarded 40% of the world's RAM supply, which they likely have no use for other than to starve the competition. They (or other competitors) could be using the same strategy for other hardware.

Which is worrying, because if this continues, and Google, which has GCP, is struggling to serve requests, there's no telling what will happen to services like Hetzner etc.

tom_m 3 hours ago | parent | prev | next [-]

You can build plenty with the Google AI Pro plan and Antigravity. Yeah, there are some limits that should be even higher, but you can still build stuff.

mannanj 3 hours ago | parent | prev [-]

It's unfortunate, though, that they lie and deceive by calling themselves "Open"AI when they are in fact closed. And the whole non-profit-to-profit conversion and the Microsoft deals are just untrustworthy and unethical.

They also actively employ dark strategies in cooperation with CIA and who knows when they will pull the rug under you again.

Do you really trust a foundationally rotten group of people who avoid accountability?

gck1 3 hours ago | parent [-]

I don't know what it's called when something becomes an irony and then that irony becomes an irony itself, but that's what's up with OpenAI today. On one hand, they started the 'we're closing things down for safety' line and normalized $200/mo subscriptions, but now they're becoming the most open AI company among the big three. Their tooling is open source, their quotas on lower plans are lenient, and their allowance of third-party integrations is also unique.

I would still consider the OpenAI name incorrect, but among the three, they kind of are open.

harshitaneja 30 minutes ago | parent | prev | next [-]

Just adding for context: I use Gemini Ultra, and across all models from Gemini 3.1 Pro to Claude Opus 4.6, I have never hit 429s, and hitting model quota limits is incredibly rare; it only happens if I am trying to run three projects at once. While not the biggest agentic coding fan, I have been toying with these tools and running them for at least 7-8 hours a day, if not longer.

tempaccount420 3 hours ago | parent | prev | next [-]

I'm guessing at least 50% of the "users" of Antigravity are actually OpenCode users exploiting the OAuth flow and endpoint. It must be infuriating to them if they're subsidizing it.

The OpenCode plugin (8.7k stars btw!) even advertises "Multi-account support — add multiple Google accounts, auto-rotates when rate-limited"[1]

[1] https://github.com/NoeFabris/opencode-antigravity-auth/blob/...
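The "auto-rotates when rate-limited" behavior the plugin advertises can be sketched roughly like this. This is a hypothetical illustration of the general technique, not the plugin's actual code; all names here are made up:

```python
from itertools import cycle

class AccountRotator:
    """Sketch of multi-account rotation: on a 429, switch credentials and
    retry immediately instead of waiting out the rate limit."""

    def __init__(self, accounts):
        self._ring = cycle(accounts)       # round-robin over credentials
        self.current = next(self._ring)

    def request(self, send, max_rotations=3):
        # send(account) -> (status, body)
        for _ in range(max_rotations):
            status, body = send(self.current)
            if status != 429:
                return status, body
            self.current = next(self._ring)  # rotate to the next account
        return status, body

# Usage: alice is currently rate-limited, so the rotator falls through to bob.
rotator = AccountRotator(["alice@example.com", "bob@example.com"])
limited = {"alice@example.com"}
status, body = rotator.request(
    lambda acct: (429, "") if acct in limited else (200, acct))
# status == 200, body == "bob@example.com"
```

From the provider's side, each rotation looks like a fresh account consuming its own subsidized quota, which is exactly why this pattern multiplies the cost of a free endpoint.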

oofbey 3 hours ago | parent | prev [-]

I’ve often suspected these models of getting dumber when the service is under high load, but I’ve never actually seen measured results or proof. Does anybody know of real published data here?

transcriptase 2 hours ago | parent | next [-]

ChatGPT was brutal for it a couple years ago. You could tell when it would go into “lazy mode” during peak usage periods.

Suddenly, instead of writing the code you asked for, it would give some generic bullet points telling you to find a library that does what you wanted and to read the documentation.

forgotTheLast 2 hours ago | parent | prev [-]

Not exactly what you're looking for but https://news.ycombinator.com/item?id=46810282