sidrag22 7 hours ago

It feels like an insult to readers to pretend that their monthly revenue is comparable to Google's or Apple's growth when the funding behind them is absurdly different, not to mention inflation itself.

I am very much on board with AI in my workflow. I just don't really see a future where OpenAI/Anthropic are the absolute front runners for devs, though. Maybe OpenAI does just have the better vision by targeting the general public instead, competing to become the next Google before Google can just stay Google?

What is their next step to ensure local models never overtake them? If I could use Opus 4.6 as a local model instead and wrap it in someone else's CLI tool, I'd 100% do it today. Are future models going to be so far beyond in capability that this sounds foolish? The top models are more than enough to keep up with my own features before I can think of more... so how do they stretch further than that?

A side note I keep thinking about: how impossible is a world where open source base models are collectively trained, similar to a proof-of-work-style pool, and then smaller companies simply spin off their own finishing touches or whatever based on that base model? Am I thinking of things too simplistically? Is this not a possibility?

simonjgreen 7 hours ago | parent | next [-]

Anthropic is definitely gaining ground over OpenAI in the business world. Cowork is the absolute hotness right now, and it even prompted MSFT to drop their own variant yesterday.

strongpigeon 7 hours ago | parent | next [-]

Ask anybody you know that works in Big Tech. They're all pushing hard for Claude Code adoption.

operatingthetan 6 hours ago | parent | prev [-]

Codex and Gemini CLI seem 1-2 months behind Claude Code. They will catch up. This race will eventually be won by whoever can come up with the cheapest compute.

a1studmuffin 6 hours ago | parent [-]

And that's a dangerous game because the cheaper compute gets, the more likely consumers are to self-host rather than pay a subscription.

ds2df 6 hours ago | parent | next [-]

Apple could figure out a way to neatly package it into their ecosystem.

winrid 6 hours ago | parent | prev [-]

Not really. Most people won't self host.

jonah 5 hours ago | parent [-]

The general public will self-host when it's built into your next phone or laptop straight out of the box, or maybe available from the App Store.

delecti 4 hours ago | parent | next [-]

I agree that that's what it would take, but compute would need to get very cheap for it to be feasible to keep models running locally. That's an awful lot of memory to have just sitting with the model running in it.

winrid 5 hours ago | parent | prev [-]

True. I was thinking more of power users. Do you think Opus level capabilities will run on your average laptop in a year? I think that's pretty far away if ever.

zozbot234 5 hours ago | parent [-]

You can demonstrate "running" the latest open Kimi or GLM model on a top-of-the-line laptop at very low throughput (Kimi at 2 tok/s, which is slow when you account for thinking time) today, courtesy of Flash-MoE with SSD weights offload. That's not Opus-like, it's not an "average" laptop and it's not really usable for non-niche purposes due to the low throughput. But it's impressive in a way, and it does give a nice idea of what might be feasible down the line.
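To put that throughput in perspective, here's a back-of-envelope sketch (the response length is my own illustrative assumption, not a benchmark):

```python
# How long a single response takes at SSD-offload decode speeds.
# decode rate is the ~2 tok/s figure above; response length is a rough guess
# for a reasoning model's thinking tokens plus the final answer.
decode_rate_tok_per_s = 2
response_tokens = 1_500

seconds = response_tokens / decode_rate_tok_per_s
print(f"{seconds / 60:.1f} minutes per response")  # 12.5 minutes per response
```

At ten-plus minutes per reply, "running" is technically true but nowhere near interactive use.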

miki123211 6 hours ago | parent | prev | next [-]

> how impossible is a world where open source base models are collectively trained similar to a proof of work style pool

Current multi-GPU training setups assume much higher bandwidth (and lower latency) between the GPUs than you can get with an internet connection. Even cross-datacenter training isn't really practical.

LLM training isn't embarrassingly parallel, not like crypto mining is for example. It's not like you can just add more nodes to the mix and magically get speedups. You can get a lot out of parallelism, certainly, but it's not as straightforward and requires work to fully utilize.
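A rough back-of-envelope shows why the bandwidth gap matters: in plain data-parallel training, every optimizer step involves an all-reduce over the full gradient tensor. All figures below are illustrative assumptions, not measurements:

```python
# Naive per-step gradient sync traffic for data-parallel training.
params = 70e9                  # a 70B-parameter model
bytes_per_grad = 2             # bf16 gradients
grad_bytes = params * bytes_per_grad   # ~140 GB to exchange each step

datacenter_link = 400e9 / 8    # 400 Gb/s NVLink/InfiniBand-class fabric, bytes/s
home_link = 1e9 / 8            # 1 Gb/s consumer connection, bytes/s

print(f"sync per step: {grad_bytes / datacenter_link:.1f} s in-datacenter, "
      f"{grad_bytes / home_link / 60:.0f} min over home internet")
# sync per step: 2.8 s in-datacenter, 19 min over home internet
```

Real systems overlap communication with compute and shard the sync, but the orders of magnitude are the point: a step-time budget that's seconds in a datacenter becomes tens of minutes over consumer links.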

thomasahle 7 hours ago | parent | prev | next [-]

It's hard to train models in the open. All the big players are using lots of "dodgy" training data. Like books, video, code, destinations. If you did that in the open, the lawyers would shut you down.

ravenstine 7 hours ago | parent | prev | next [-]

Though I think these companies are wildly overvalued, I don't see LLMs as a service going away in the future. The value in OpenAI is that it provides extra compute, data access, etc. My money is on local AI becoming more of a thing, while services like OpenAI still exist for local AIs to consult. If a local model can somehow know that it's out of its depth on a question/prompt, it can ask an OpenAI model if one is available, but otherwise still work locally if OpenAI fails to respond or goes out of business. To me that makes a lot more sense than the future being either-or.
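A minimal sketch of that local-first, cloud-fallback routing (all function names and the self-reported confidence heuristic are hypothetical, not any real API):

```python
# Hypothetical router: try the local model first, escalate to a hosted model
# only when the local one reports low confidence, and degrade gracefully if
# the hosted service is unreachable or gone.

def route(prompt, local_model, hosted_model, threshold=0.7):
    answer, confidence = local_model(prompt)   # local model returns its own confidence estimate
    if confidence >= threshold:
        return answer, "local"
    try:
        return hosted_model(prompt), "hosted"  # consult the bigger model
    except Exception:                          # hosted service down or out of business
        return answer, "local-fallback"

# Stubs standing in for real inference backends.
local = lambda p: ("short local answer", 0.4 if "hard" in p else 0.9)
hosted = lambda p: "detailed hosted answer"

print(route("easy question", local, hosted))   # ('short local answer', 'local')
print(route("hard question", local, hosted))   # ('detailed hosted answer', 'hosted')
```

The hard part, of course, is the confidence signal itself, which the sketch just assumes exists.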

clhodapp 6 hours ago | parent [-]

Models not being able to reliably know when they are out of their depth is a foundational limitation of the current generation of models, though.

The best they can do is somewhat reliably react to objective signals that they've failed at something (like test failures).

Aurornis 6 hours ago | parent | prev | next [-]

> What is their next step to ensure local models never overtake them?

As someone who experiments with local models a lot, I don’t see this as a threat. Running LLMs on big server hardware will always be faster and higher quality than what we can fit on our laptops.

Even in the future when there are open weight models that I can run on my laptop that match today’s Opus, I would still be using a hosted variant for most work because it will be faster, higher quality, and not make my laptop or GPU turn into a furnace every time I run a query.

zozbot234 6 hours ago | parent [-]

If your laptop overheats when you push your GPU, you can buy purpose-built "gaming" laptops that are at least nominally intended to sustain those workloads with much better cooling. Of course, running your inference on a homelab platform deployed for that purpose, without the thermal constraints of a laptop, is also possible.

Aurornis 5 hours ago | parent [-]

I didn't say it overheats. It gets hot and the fans blow, neither of which are enjoyable.

MacBook Pro laptops are preferred over "gaming" laptops for LLM use because they have large unified memory with high bandwidth. No gaming laptop can give you as much high-bandwidth LLM memory as a MacBook Pro or an AMD Strix Halo integrated system. The discrete gaming GPUs are optimized for gaming with relatively smaller VRAM.
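The bandwidth point can be made concrete with the usual rule of thumb: decode speed is roughly memory bandwidth divided by the bytes read per token (about the active weight footprint). Hardware and model figures below are rough illustrations:

```python
# Rule of thumb: decode tokens/s ~= memory bandwidth / bytes touched per token.
def decode_tok_per_s(bandwidth_gb_s, active_weights_gb):
    return bandwidth_gb_s / active_weights_gb

def fits(vram_gb, model_gb):
    return model_gb <= vram_gb

model_gb = 40  # e.g. a ~70B model quantized to ~4 bits

print(decode_tok_per_s(400, model_gb))  # 10.0 tok/s on ~400 GB/s unified memory
print(fits(16, model_gb))               # False: typical gaming-GPU VRAM can't hold it at all
```

A discrete GPU may have higher raw bandwidth, but if the weights don't fit in VRAM the bandwidth advantage never applies, which is why large unified memory wins for this workload.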

mlsu 6 hours ago | parent | prev [-]

You can host a website on any rackmount server for pennies compared to AWS. But people still use AWS.

The market for local models is always gonna be a small niche, primarily for the paranoid.

lukan 6 hours ago | parent | next [-]

"The market for local models is always gonna be a small niche, primarily for the paranoid."

Have you ever heard of industrial espionage? Or privacy regulations? Or military applications?

(Also, the US military runs Claude as a local model.)

FpUser 6 hours ago | parent | prev [-]

>"But people still use AWS"

I do not; I self-host. My current client also got rid of AWS, pocketing nice savings as a result.