consumer451 3 hours ago

This is a fun thought experiment. I believe that we are now at the $5 Uber (2014) phase of LLMs. Where will it go from here?

How much will a synthetic mid-level dev (Opus 4.5) cost in 2028, after the VC subsidies are gone? I would imagine as much as possible? Dynamic pricing?

Will the SOTA model labs even sell API keys to anyone other than partners/whales? Why even that? They are the personalized app devs and hosts!

Man, this is the golden age of building. Not everyone can do it yet, and every project you can imagine is greatly subsidized. How long will that last?

tern 3 hours ago

While I remember $5 Ubers fondly, I think this situation is significantly more complex:

- Models will get cheaper, maybe way cheaper

- Model harnesses will get more complex, maybe way more complex

- Local models may become competitive

- Capital-backed access to more tokens may become absurdly advantaged, or not

The only thing I think you can count on is that more money buys more tokens, so the more money you have, the more power you will have ... as always.

But whether some version of the current subsidy, which levels the playing field, will persist seems really hard to model.

All I can say is, the bad scenarios I can imagine are pretty bad indeed—much worse than that it's now cheaper for me to own a car, while it wasn't 10 years ago.

FuckButtons an hour ago

I can run Minimax-m2.1 on my M4 MacBook Pro at ~26 tokens/second. It’s not Opus, but it can definitely do useful work when kept on a tight leash. If models improve at anything like the rate we have seen over the last 2 years, I would imagine something as good as Opus 4.5 will run on similarly specced new hardware by then.

consumer451 37 minutes ago

I appreciate this. However, as a ChatGPT, Claude.ai, Claude Code, and Windsurf user... who has tried nearly every single variation of Claude, GPT, and Gemini in those harnesses, and has tested all of those models via API for LLM integrations into my own apps... I just want SOTA, 99% of the time, for myself, and my users.

I have never seen a use case where a "lower" model was useful, for me, and especially my users.

I am about to get almost the exact MacBook that you have, but I still don't want to inflict non-SOTA models on my code, or my users.

This is not a judgement against you, or the downloadable weights, I just don't know when it would be appropriate to use those models.

BTW, I very much wish that I could run Opus 4.5 locally. The best that I can do for my users is the Azure agreement that they will not train on their data. I also have that setting set on my claude.ai sub, but I trust them far less.

Disclaimer: No model is even close to Opus 4.5 for coding agents. In my own apps, I process a lot of text/complex context and I use Azure GPT 4.1 for limited LLM tasks... but for my "chat with the data" UX, Opus 4.5 all day long. In my testing, it has been far superior.

andai 3 hours ago

The real question is how long it'll take for Z.ai to clone it at 80% quality and offer it at cost. The answer appears to be "like 3 months".

consumer451 3 hours ago

This is a super interesting dynamic! The CCP is really good at subsidizing and flooding global markets, but in the end, it takes power to generate tokens.

In my Uber comparison, it was physical hardware on location... taxis, but this is not the case with token delivery.

This is such a complex situation in that regard. However, once the market settles and monopolies are created, the price will eventually be what the market can bear. Will that actually create an increase in gross planet product, or will the SOTA token providers just eat up the existing gross planet product, with no increase?

I suppose whoever has the cheapest electricity will win this race to the bottom? But... will that ever increase global product?

___

Upon reflection, the comment above was likely influenced by this truly amazing quote from Satya Nadella's interview on the Dwarkesh podcast. This might be one of the most enlightened things that I have ever heard in regard to modern times:

> Us self-claiming some AGI milestone, that's just nonsensical benchmark hacking to me. The real benchmark is: the world growing at 10%.

https://www.dwarkesh.com/p/satya-nadella#:~:text=Us%20self%2...