abu_ameena 3 hours ago

On-device models are the future. Users prefer them: no privacy issues, no dealing with connectivity, tokens, or changes to vendors' implementations. I have an app using the Foundation Models framework, and it works great. I only wish I could backport it to pre-macOS 26 versions.

raw_anon_1111 2 hours ago | parent | next [-]

Users don’t care about “privacy”. If they did, Meta and Alphabet wouldn’t be worth $1T+.

Users really don’t matter at all. The revenue for AI companies will be B2B, where the user is not the customer, including coding agents. Most people don’t even use computers as their primary “computing device,” and most people are buying low-end Android phones (no, I’m not saying all Android phones are crappy, but that’s what most people are buying, with the average selling price of an Android phone around $300).

barelysapient 2 hours ago | parent | next [-]

Different users. Many people care about privacy and aren’t using Meta products. And many businesses care about it too and have information policies to protect their IP.

amelius an hour ago | parent | next [-]

> Different users. Many people care about privacy and aren’t using Meta products.

Yeah but if they can rake in 100x as much by making products for people who don't care about privacy, then why spend time developing stuff for people who care?

There is still a small market left, of course, but that market will not have the billions of R&D behind it.

raw_anon_1111 2 hours ago | parent | prev [-]

70% of the world’s population use at least one Meta property at least once per day. How many of the other 30% are too poor/young/computer illiterate to be part of an addressable market?

Every company has dozens of SaaS products that store its business-critical information. Amazon installs Office on every computer, uses Slack (they were moving away from Chime when I left), and the sales department, SAs, and Professional Services use Salesforce (I'm a former employee).

The addressable market of even the companies that care about privacy is not large. How long will it be before computers that can run even GPT-4-level LLMs become cheap enough that companies will give them to all of their developers?

JambalayaJimbo an hour ago | parent | next [-]

The banking industry absolutely does care about the privacy of its business data, btw. We do use tools like Confluence, but they're all hosted in our own data centers.

raw_anon_1111 an hour ago | parent [-]

And Capital One and Goldman Sachs are both hosted on AWS…

innagadadavida an hour ago | parent | prev [-]

These are all great statistics, but how do you explain the Clawdbot explosion, even in lower-income countries like China? So much demand that Apple can’t keep up production of Mac Minis. Why aren’t these folks going toward cloud solutions? Is it cost, or is there some consideration for having more control over their data?

zozbot234 an hour ago | parent | next [-]

ClawBot doesn't generally run the model locally, it just talks to remote APIs. No different than any other agentic harness. You could run a local model on the same Mac Mini as your agent, but it wouldn't be very smart and many agentic tasks around computer GUI/browser use, etc. would be out of reach.

raw_anon_1111 an hour ago | parent | prev [-]

And people using Clawdbot are still not using local inference for the most part…

They aren’t buying high-end $2000+ Mac Minis.

abu_ameena 2 hours ago | parent | prev | next [-]

I see it as a long-term tradeoff on user freedom. You pay upfront for capable hardware and get your services running locally (you don’t pay subscriptions). Or you buy cheap hardware and still need the same services “running in some cloud” for $X monthly, where X goes up depending on the corporate bottom line.
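A quick back-of-the-envelope version of that tradeoff; the prices below are illustrative assumptions, not quotes:

```python
# Break-even sketch: one-time capable hardware vs a monthly cloud subscription.
# Both numbers are made up for illustration.
hardware_cost = 2000   # one-time spend on hardware capable of local inference ($)
monthly_fee = 30       # hosted "running in some cloud" subscription ($/month)

break_even_months = hardware_cost / monthly_fee
print(round(break_even_months, 1))  # about 66.7 months at these numbers
```

At these assumed prices the hardware pays for itself in roughly five and a half years, which is why the answer hinges so heavily on what the subscription actually costs and how long the hardware stays useful.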

raw_anon_1111 2 hours ago | parent [-]

In the history of cloud computing, prices have mostly only come down, especially as inference becomes a commodity. Realistically, just looking at Mac prices, the cost of a computer capable of decent local inference would be around $6000 per person.

The world is not moving back to on prem.

esseph an hour ago | parent [-]

> The world is not moving back to on prem.

Lol, you should tell my customers (that are moving back on prem) that!

You should also tell Microsoft, who just yesterday said they are going back to focusing on local apps.

raw_anon_1111 an hour ago | parent [-]

Your customers are an anecdote; now compare that to the publicly reported numbers from AWS, GCP, and Azure, where they all say the only thing keeping them from growing more is the chip shortage.

esseph an hour ago | parent [-]

Oh I'm sure they'll continue to have some cloud services, no doubt. But look at VMware for example, even after the insane price increases. Nutanix also seems to be doing quite well. I'm seeing a fair amount of on-prem bare metal k8s too.

raw_anon_1111 38 minutes ago | parent [-]

Again: anecdotes are not data. We have data. That would be about as silly as me citing my own experience as proof that “everyone is moving to AWS” when I work for a company that is exclusively an AWS partner consulting firm.

DesiLurker an hour ago | parent | prev [-]

You are missing a 'given a choice' disclaimer. Meta is pretty much a monopoly in the social space, and so is Android. Given a choice, people will absolutely gravitate toward a not-always-snooping device; most people with resources, anyway, who matter for AI adoption.

Oh, and wait till ad companies start selling your healthcare data; you will see how fast things turn 'given a choice'.

raw_anon_1111 an hour ago | parent [-]

People A) don’t have to use Meta and B) do have a choice not to use a mobile phone made by an ad-tech company.

thefourthchime 15 minutes ago | parent | prev | next [-]

Maybe some more distant future. For me, I'm still struggling with the hallucinations and screw-ups that the state-of-the-art models give me.

jesse23 29 minutes ago | parent | prev | next [-]

Yes, but so far do we have a working practice? Given a local model, what infrastructure could we use, and what good practices exist for leveraging it for local tasks?

testing22321 2 hours ago | parent | prev [-]

I see all these LLM posts about if a certain model can run locally on certain hardware and I don’t get it.

What are you doing with these local models that run at X tokens/sec?

Do you have the equivalent of ChatGPT running entirely locally? What do you do with it? Why? I honestly don’t understand the point or use case.

svachalek 23 minutes ago | parent | next [-]

1. There are small local models with the capabilities of frontier models from a year ago

2. They aren't harvesting your data for government files or training purposes

3. They won't be altered overnight to push advertising or a political agenda

4. They won't have their pricing raised at will

5. They won't disappear as soon as their host wants you to switch

samuel 2 hours ago | parent | prev | next [-]

Chat is certainly an option, but the real deal is agents, which have access to way more sensitive information.

dec0dedab0de an hour ago | parent | prev [-]

Most LLM tooling can handle different models. Ollama makes it easy to install and run models locally, so you can configure aider or VS Code or whatever you're using to connect to ChatGPT to point to your local models instead.
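For example, Ollama exposes an OpenAI-compatible API on localhost, so pointing a tool at it is mostly a matter of overriding the endpoint. A minimal sketch (the model name is just an example; it assumes `ollama serve` is running with the model already pulled):

```python
# Point an OpenAI-compatible client at a local Ollama server instead of
# api.openai.com. Port 11434 is Ollama's default.
import os

os.environ["OPENAI_API_BASE"] = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint
os.environ["OPENAI_API_KEY"] = "unused"  # Ollama ignores the key, but clients require one

# Some tools take the model directly on the command line instead, e.g. aider:
#   aider --model ollama/qwen2.5-coder:7b
```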

None of them are as good as the big hosted models, but you might be surprised at how capable they are. I like running things locally when I can, and I also like not worrying about accidentally burning through tokens.

I think the future is multiple locally run models that call out to hosted models when necessary. I can imagine every device coming with a base model and using LoRAs to learn about the user's needs, with companies and maybe even households having their own shared models that do heavier lifting, while companies like OpenAI and Anthropic continue to host the most powerful and expensive options.
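That local-first, hosted-fallback idea can be sketched as a tiny router. Everything here is hypothetical: `local_model` and `hosted_model` are stand-in callables, and the word-count heuristic is a placeholder for whatever signal a real router would use.

```python
# Speculative sketch: answer cheap/short prompts with a local model and
# escalate bigger ones to a hosted model.

def route(prompt, local_model, hosted_model, local_word_limit=500):
    """Send short prompts to the local model; fall back to the hosted one."""
    if len(prompt.split()) <= local_word_limit:
        return local_model(prompt)
    return hosted_model(prompt)

# Usage with stub models standing in for real inference backends:
local = lambda p: "answered locally"
hosted = lambda p: "answered by hosted model"
print(route("summarize this note", local, hosted))  # short -> local
print(route("word " * 1000, local, hosted))         # long -> hosted
```

A real version would route on task type, context size, or required tool access rather than raw length, but the shape (try local, escalate when needed) is the same.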