geeky4qwerty 7 hours ago

I'm afraid the music may be slowly fading at this party, and the lights will soon be turned on. We may very well look back on the last couple years as the golden era of subsidized GenAI compute.

For those not in the Google Gemini/Antigravity sphere, over the last month or so that community has been met with nothing short of contempt from Google while attempting to address an apparent bait-and-switch on quota expectations for its Pro and Ultra customers (myself included). [1]

While I continue to pay for my Google Pro subscription, probably out of some Stockholm Syndrome-level loyalty and the false hope that this is just a bug rather than Google being Google and self-immolating a good product, I have since moved to Kiro for my IDE and Codex for my CLI, and I'm as happy as a clam with this new setup.

[1] https://github.com/google-gemini/gemini-cli/issues/24937

dgellow 6 hours ago | parent | next [-]

For what it’s worth, it was pretty obvious from the get-go that this wasn’t a realistic long-term deal. I’ve been building all the libraries I wished existed over the past 1-2 years, so I'd have something neat to work with whenever the free-compute era ends. I feel that’s the approach that makes sense: take the free tokens and build everything you would want to exist if you no longer had access to the service. If it goes away, you’re back to enjoying writing code by hand, but with all the building blocks you dreamt of. If it never goes away, nothing is wasted; you still have cool libs.

asdfasgasdgasdg 6 hours ago | parent | prev | next [-]

So, Antigravity will definitely eat up your Pro quota quickly. You can run out of it in an hour (at least on the $20/mo plan), and then you'll be waiting five days for it to refresh.

However, I've found that the Flash quota is much more generous. I have been building a trio drive FOC (field-oriented control) system for the STM32G474 and basically prompting my way through the process. I have yet to run completely out of Flash quota in a given five-hour window. It is definitely completing the work a lot faster than I could myself, mainly due to its patience in trying different things to get to the bottom of problems. It's not perfect, but it's pretty good. You do often have to pop back in and clean up debris left from debugging or from attempts that went nowhere, or prompt the AI to do so, but that's a lot easier than figuring things out in the first place, as long as you keep up with it.

I say this as someone who was really skeptical of AI coding until fairly recently. A friend gave me a tutorial last weekend, basically pointing out that you need to instruct the AI to test everything. Getting hardware-in-loop unit tests up and running was a big turning point for productivity on this project. I also self-wired a bunch of the peripherals on my dev board so that the unit tests could pretend to be connected to real external devices.

I think it helps a lot that I've been programming for the last twenty years, so I can sometimes jump in when it looks like the AI is spinning its wheels. But anyway, that's my experience. I'm just using flash and plan mode for everything and not running out of the $20/mo quota, probably getting things done 3x as fast as I could if I were writing everything myself.
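The testing workflow described above can be sketched in miniature. This is a hypothetical illustration, not the poster's actual firmware: a loopback transport stands in for the self-wired peripherals, and the framing code is invented for the example. The point is that the code under test exercises the same path it would against real wired hardware.

```python
# Sketch of the "instruct the AI to test everything" idea: the code under
# test talks to a peripheral through a transport interface. On a dev board
# the transport would be a UART pin wired back to itself; here a loopback
# stub stands in so the test logic is visible.

class LoopbackUART:
    """Stands in for a self-wired UART: whatever is written comes back."""
    def __init__(self):
        self._buf = bytearray()

    def write(self, data: bytes) -> None:
        self._buf.extend(data)

    def read(self, n: int) -> bytes:
        out = bytes(self._buf[:n])
        del self._buf[:n]
        return out

def send_frame(uart, payload: bytes) -> None:
    # Hypothetical framing: length byte, payload, XOR checksum.
    checksum = 0
    for b in payload:
        checksum ^= b
    uart.write(bytes([len(payload)]) + payload + bytes([checksum]))

def recv_frame(uart) -> bytes:
    length = uart.read(1)[0]
    payload = uart.read(length)
    checksum = uart.read(1)[0]
    expected = 0
    for b in payload:
        expected ^= b
    if checksum != expected:
        raise ValueError("checksum mismatch")
    return payload

# The hardware-in-the-loop-style test: drive the real code path end to end.
uart = LoopbackUART()
send_frame(uart, b"hello")
assert recv_frame(uart) == b"hello"
```

On real hardware only the transport changes; the framing code and the assertion stay identical, which is what makes the wired-loopback setup useful.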

1970-01-01 6 hours ago | parent | prev | next [-]

Lights on = ads in your output. End of year at the latest; they can't keep kicking the massive costs down the road.

fooster 6 hours ago | parent | next [-]

Where is your evidence of this "massive cost"? Inference is massively profitable for both Anthropic and OpenAI. Training is not.

kibwen 6 hours ago | parent | next [-]

The evidence is that quotas exist, as seen here, and are low enough that people are hitting them regularly. When was the last time you hit your quota of Google searches? When was the last time you hit your quota of StackOverflow questions? When was the last time you hit your quota of YouTube videos? Any service will rate limit abuse, but if abuse is indistinguishable from regular use from the provider's perspective, that's not a good sign.

jerf 6 hours ago | parent | next [-]

It's also kind of interesting that they don't think they can do what an economy would normally do in this situation, which is raise prices until supply matches. Shortages generally imply mispricing.

There are a lot of angles you could take from that as a starting point, and I'm not confident that I fully understand it, so I'll leave it to the reader.

caminante 6 hours ago | parent | prev [-]

Great point.

The parent's argument is that the marginal cost of inference is minimal. The fundamental flaw, however, is that he's separating inference from the high cost of the frontier models themselves. It's a cross-subsidy that can't be ignored.

bachmeier 3 hours ago | parent [-]

Without any insider knowledge on the economics of these companies, I suspect it's that the amount of infrastructure you have to build is determined by peak usage rather than average usage. If peak usage is much higher for a small part of one day a week (say on Monday morning as software developers across the US get back to work) the cost of fulfilling demand at all times can be insane. That's why companies are implementing batch/standard/priority pricing for the API.
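The peak-versus-average point can be made concrete with toy numbers (all figures hypothetical, just to show the shape of the problem):

```python
# Toy model: capacity must be provisioned for the peak, but revenue
# scales with average usage, so peaky demand inflates the effective
# cost per served request.
avg_load = 100    # requests/sec averaged over the week (hypothetical)
peak_load = 400   # Monday-morning burst (hypothetical)
cost_per_unit_capacity = 1.0  # arbitrary cost units

# Infrastructure is sized for the peak...
infra_cost = peak_load * cost_per_unit_capacity

# ...but utilization is set by the average.
cost_per_avg_unit = infra_cost / avg_load
print(cost_per_avg_unit)  # 4.0 -- 4x what perfectly flat demand would cost
```

Batch/standard/priority API pricing is one way to flatten that curve: discounted batch jobs soak up the off-peak capacity that the peak forced you to build anyway.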

weakfish 6 hours ago | parent | prev | next [-]

This article convinced me otherwise https://www.wheresyoured.at/the-subprime-ai-crisis-is-here/

Narciss 5 hours ago | parent [-]

This is a great article, thanks for sharing

scrollop 6 hours ago | parent | prev | next [-]

The majority of accounts are free - these are profitable?

IMO they need as many users as possible before their IPO - then the changes will really begin.

quikoa 5 hours ago | parent | prev | next [-]

Inference for API or subscriptions? There is a massive price difference between the two.

wesammikhail 6 hours ago | parent | prev [-]

source?

KaoruAoiShiho 6 hours ago | parent [-]

After googling https://www.reddit.com/r/singularity/comments/1psesym/openai...

wesammikhail 5 hours ago | parent | next [-]

I've seen sources like this before. It's all hearsay and promo. I was asking for any publicly available verifiable information regarding the cost of inference at scale. I haven't seen any such info personally which is why I asked.

I'm dying to see an S-1 filing from Anthropic or OpenAI. I don't actually think inference is as cheap as people say once you consider the total cost (hardware, energy, capex, etc.).

KaoruAoiShiho 3 hours ago | parent [-]

Well, they're not public yet, so you'll have to put up with rumors. But numbers are available for companies like DeepSeek, which says it has an 80% profit margin, so it stands to reason OAI etc. would post similar numbers, considering they charge much more.

caminante an hour ago | parent [-]

AFAIK,

1. the 80% margin from 2025 was theoretical,

2. they're relying on distillation/synthetic data for training,

3. and have been very opaque about cross-subsidization of R&D with their models.

The distillation alone adds a big asterisk for comparisons.

KaoruAoiShiho 25 minutes ago | parent [-]

Talking nonsense.

caminante 6 hours ago | parent | prev [-]

>OpenAI's compute margin, referring to the share of revenue excluding the costs of running its AI models for paying users

Huh?

The reddit summary comment makes no sense. How are they getting revenues without ads or paying customers?

"After" makes more sense.

FTA:

>The company has yet to show a profit and is searching for ways to make money to cover its high computing costs and infrastructure plans.

jerf 6 hours ago | parent | prev [-]

Ads do not pay enough to cover AI usage. People see the big numbers Google and Facebook make in ads and forget to divide them by the number of people they serve ads to, let alone the number of ads served to reach that per-user figure. You can't pay for 3 cents of inference with .07 cents of revenue.
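Spelling out the per-interaction arithmetic (using the comment's own illustrative figures, not real financials):

```python
# Per-interaction economics from the comment: 3 cents of inference
# vs .07 cents (i.e., $0.0007) of ad revenue.
inference_cost = 0.03   # dollars of inference per interaction
ad_revenue = 0.0007     # dollars of ad revenue per interaction

shortfall = inference_cost - ad_revenue
ads_needed = inference_cost / ad_revenue  # impressions to break even

print(f"shortfall per interaction: ${shortfall:.4f}")
print(f"ad impressions needed to break even: {ads_needed:.0f}")
```

At those assumed rates you would need on the order of 40+ ad impressions per interaction just to break even, which is the gap the comment is pointing at.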

You also can't put ads in code-completion AIs, because the instant you do, their utility to me at work drops to negative. Guess how much money companies are going to pay for negative-value AIs? Let's just say it won't exactly pay for the AI bubble. The moment a code agent puts an ad for, well, anything into code that gets served out to a customer, someone is going to sue. The merits of the case won't matter, nor will the fact that the customer "should have caught it in review"; the lawsuit and the public reputation hit (how many people here are reading this and salivating at the thought of being able to post an angrygram about AIs being nothing but ad machines?) still cost way too much for the AI companies creating the agents to risk.

rr808 5 hours ago | parent | prev | next [-]

I still remember those $3 uber rides.

faangguyindia 6 hours ago | parent | prev | next [-]

Ultimately we'll find more efficient techniques and hardware, and AI companies will end up owning nuclear power stations and continue providing models capable of 10x what they are now.

Valuations have already reached the point where these companies can run their own nuclear power stations, fund development of new hardware and techniques, and boost the capabilities of their models by 10x.

Root_Denied 3 hours ago | parent | next [-]

There's not enough nuclear to go around, and the approval/permitting process for new nuclear power plants is nothing to sneeze at, both in terms of time and cost.

That's also ignoring that nuclear power plants consume quite a bit of water, which may be a more difficult bottleneck in and of itself, even without trying to add nuclear into the mix.

croes 4 hours ago | parent | prev [-]

Too bad the models will collapse because of the lack of new good training data.

How many companies will generate profit in the end, and what will happen to all those power stations and data centers?

elephanlemon 5 hours ago | parent | prev | next [-]

IMO we are currently in the ENIAC era of LLMs. Perhaps there will be a brief moment where things get worse, but long term the cost of these things will go way down.

pier25 5 hours ago | parent | next [-]

Cost will probably go down, but nobody knows when or how. It might take 10 years for all we know, as training costs have only been rising.

A huge difference is early computers were not subsidized. It took decades until most people could afford to own a computer at home.

croes 4 hours ago | parent | prev [-]

Or we are in the early Netflix era where profit wasn’t as important as customer growth.

rzkyif 6 hours ago | parent | prev | next [-]

Fellow annoyed Google AI Pro subscriber here!

Can confirm. I initially enjoyed the 5-hour limits on Gemini CLI and Antigravity so much that I paid for a full year, thinking it was a great decision.

In the following months, they significantly cut the 5-hour limits (not sure they even exist anymore), introduced the unrealistically bad weekly limit that I can fully consume in 1-2 hours, introduced the monthly AI credits system, and added ads to upgrade to Ultra everywhere.

At the very least, the Gemini mobile app / web app is still kinda useful for project planning and day-to-day use, I guess. They also bumped the storage from 2TB to 5TB, but I don't even use that.

stavros 6 hours ago | parent | next [-]

It should be illegal to change the terms of the subscription mid-period. If you paid for the full year, you should get that plan for the whole year. I don't understand how it's ok for corporations to just change the terms mid-way, and we just have to accept it.

bachmeier 4 hours ago | parent | next [-]

> It should be illegal to change the terms of the subscription mid-period

Unfortunately, at least for those of us in the US, there isn't legally much that can be done. It's simply not possible to make a contract that would obligate a company to fulfill its promises on this type of sale.

bobmcnamara 6 hours ago | parent | prev [-]

T&C?

stavros 4 hours ago | parent [-]

I'm sure the T&C say something like "you're going to pay us money, and we reserve the right to give you something for it, or maybe nothing, and you should thank us for the privilege".

logicchains 4 hours ago | parent | prev | next [-]

It's the exact same thing they did with Google BigQuery, which was initially an absolutely amazing piece of technology before they smothered it with more and more limits and restrictions. It's like they're putting SREs first and customers second.

nprateem 6 hours ago | parent | prev [-]

Don't bother upgrading to Ultra. It's also now easy to burn all your credits, whereas in January it was almost impossible.

palata 6 hours ago | parent | prev [-]

> We may very well look back on the last couple years as the golden era of subsidized GenAI compute.

Looks like enshittification on steroids, honestly.

omosubi 6 hours ago | parent [-]

Getting $5000 worth of product essentially free and then being told to pay is not enshittification.

Chaosvex 5 hours ago | parent | next [-]

Another take: perhaps they shouldn't have been pricing it at that point if they weren't capable of actually delivering.

knollimar 3 hours ago | parent | prev | next [-]

It absolutely is. Loss leading is their fault and anticompetitive.

quikoa 5 hours ago | parent | prev | next [-]

The cost to the AI companies might be $5000, but "essentially free" could already be close to the limit of what people are willing to spend. If that's the case, then the enshittification will continue and/or many AI companies will never be profitable.

tvbusy 5 hours ago | parent | prev | next [-]

We have seen this before: companies using VC money to take over the market and then raising prices. In the end, we're left worse off by these scumbags, but some will still sing that we got free service, so it's not enshittification.

zzzoom 5 hours ago | parent | prev [-]

It's predatory pricing.