benterix 2 days ago

> Browsers and operating systems are increasingly expected to gain access to language models.[0]

Are they?

[0] https://github.com/webmachinelearning/prompt-api/blob/main/R...

stingraycharles 2 days ago | parent | next [-]

I think this is the wrong way. I don’t want my OS or browser to have access to an LLM, but I do want my LLM to have access to a browser or OS (and they already have).

So they should provide an interface to LLMs, disabled by default, enabled when users want it, and that’s it imho.

That also gives me the choice of which LLM provider to use, rather than being locked into whatever LLM Apple decided to put in their OS.

I want to give Claude access to the stuff Apple Intelligence has access to, for example.

domenicd 2 days ago | parent | prev | next [-]

(I wrote those words originally.)

Wow. I had no idea that people would misinterpret what I was saying in this way. I didn't mean to imply it was an expectation of users or developers. I meant it as a statement of what is currently a growing industry trend among OS and browser vendors of shipping, or preparing to ship, LMs.

By now the statement could probably be amended from "expected to gain access to" to "shipping with".

I hope the team maintaining the project now makes such an update, since apparently it's confusing so many people!

singron 2 days ago | parent | next [-]

I thought it was clear and am also surprised by the reaction (en-US speaker). "Is/are expected" is generally used as a passive-voiced form of "we/they predict" (obviously without having to specify a specific pronoun). E.g. "It's expected to rain tomorrow" means a weather forecast says it will rain tomorrow and usually not that people want it to rain tomorrow.

I wonder if this phrase has different connotations among other English readers? A lot of these comments are fairly early for US timezones.

wavemode 2 days ago | parent [-]

I don't think US vs. non-US has anything to do with it. It's an ambiguous phrase, whose meaning is usually resolved by context.

"It's expected to rain tomorrow" is a prediction, whereas "students are expected to behave themselves" is an expectation (with consequences, presumably).

In the former case we clearly aren't saying we want it to rain, just that we believe it's likely, whereas in the latter example we are clearly expressing that we do want students to behave.

It's ambiguous because "expect" has two different meanings:

> to consider probable or certain

> to consider reasonable, due, or necessary

benterix 2 days ago | parent | prev [-]

[dead]

concinds 2 days ago | parent | prev | next [-]

Sure. macOS, iOS and Windows have local model APIs for third-party devs. Chrome is trialing it. Firefox uses models to generate alt-text, but no API.
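For concreteness, the Chrome trial exposes roughly this shape. This is a sketch based on the webmachinelearning/prompt-api explainer; the `LanguageModel` global and its `availability()`/`create()`/`prompt()` methods come from the current draft and may well change before anything ships broadly:

```javascript
// Sketch of the proposed Prompt API, per the current explainer draft.
// Feature-detects first, so it degrades to a no-op wherever the API
// (or the user's opt-in) is absent.
async function summarize(text) {
  if (!('LanguageModel' in globalThis)) {
    return null; // API not shipped, or disabled: caller falls back
  }
  if ((await LanguageModel.availability()) === 'unavailable') {
    return null; // model can't run or be downloaded on this device
  }
  const session = await LanguageModel.create();
  try {
    return await session.prompt(`Summarize in one sentence:\n${text}`);
  } finally {
    session.destroy(); // release the on-device model resources
  }
}
```

The availability check matters because `create()` may otherwise trigger a multi-gigabyte model download, which is exactly the default-download behavior criticized elsewhere in this thread.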

In theory it's useful. If devs can rely on local models, it's more private and decentralized, they don't need to funnel money to AWS or Anthropic. There are low-stakes use cases that only make sense if they're local (available offline) and free.

But in practice I've seen zero adoption of Apple Foundation Models in native apps. I wonder if any Mac/iOS devs have anything to share on this.

dannyw 2 days ago | parent | next [-]

In practice it’s useful too. The local translation in Firefox is quite good, and I love that I can translate pages entirely on my machine, without the contents going to another server.

As for Apple Foundation models, I think the issue is more that they’re just not very intelligent or good; maybe WWDC will change that. But if you want to implement LLM functionality, you’re better off either calling an API or shipping a better small on-device model.

pbronez 2 days ago | parent [-]

Yeah I looked into the Apple Foundation models and was surprised at their limited scope. On reflection it made sense though. They’re giving you the small part of the LLM capability surface that (1) can run with good performance on all their hardware and (2) works reliably.

It’s not enough for a chat-first research agent, but it’s definitely enough to unlock features that rely on natural language understanding. Seems like a small thing compared to Claude/ChatGPT and the general hype, but still magic in its own context.

getpokedagain 2 days ago | parent | prev [-]

I don't think this is what was meant. I don't think they were questioning whether OS and browser makers were embedding LLM features, but rather whether people want them.

I find many of them frustrating. I had an iPhone previously, and the LLM summaries of text messages are what drove me to finally drop iOS. I have a family member who is undergoing cancer treatment. I can't explain to you the frustration of seeing a wrong summary because the LLM hallucinated test results when the actual text simply said they were taking a test. OS basics and communication should be trustable, not the hallucinations of a small shitty model.

zamalek 2 days ago | parent | prev | next [-]

AI massively empowers people who are incapable of anything except bikeshedding. It itself is very likely to be a bikeshed (but there are legitimate uses), and it also gives them the power to drone on until they overpower any opposition to their useless ideas.

Everything is increasingly expected to gain bikesheds.

Can't wait for the CVEs.

anthonyrstevens 2 days ago | parent [-]

>> people who are incapable of anything except bikeshedding

The amount of insulting language directed at people who actually have an open mind about AI and AI tooling is frustrating. Can you all just please address the merits of the topic of the post instead of making every AI-related post on HN an excuse to vent about your own particular worldview and insult people who don't necessarily agree?

zamalek 2 days ago | parent [-]

Platform support for AI has as much place in a browser as it does in Notepad. This isn't about being open-minded at all. I have written multiple MCPs, I use it daily, I am not in the crowd who "don't have an open mind." This outright non-feature is a significant source of issues, not least of which is fingerprinting.

Make an AI browser extension. Done.

Shoving AI into anything where it can go is not having an open mind about things; it's nothing more than shoving AI into anything where it can go.

On the inverse, can you provide a single reason why this API should exist which isn't something that obviously erupted from an LLM? Again:

> Browsers and operating systems are increasingly expected to gain access to language models.

God help people if they have to copy their prompt from ChatGPT to Claude.

noirscape 2 days ago | parent | prev | next [-]

It's the typical "cart before the horse" kind of corporate tech talk. It's pretty standard if Silicon Valley wants to sell shit that nobody actually wants; they just assume that people will want it, regardless of whether they actually do. Most of the tech press is too obsessed with retaining their "access" to actually be critical of this sort of thing, and most of the regular press doesn't care enough to actually investigate.

We've seen this sort of song and dance before, crypto jumps to mind. Remember when social media sites suddenly were all about those hexagonal avatars? Most of this stuff is really in that same vein.

(Which to be clear, users don't want this. AI pushes by pretty much all recent user feedback metrics are largely tiring out users and reek of corporate desperation to sell shit. It's only a very specific subsection of Silicon Valley that wants to stuff AI in everything like this.)

stingraycharles 2 days ago | parent [-]

I think the resentment for Copilot is pretty much universal. People like AI, when it’s not forced upon them.

A lot of these products feel guided by an “everything must become AI” FOMO movement rather than actual thoughtful integration.

PearlRiver a day ago | parent [-]

Stuff like Google Lens is nice. It solves an actual problem (me looking at Japanese and having a seizure).

pwdisswordfishq 2 days ago | parent | prev | next [-]

Apparently the browser API surface is not obscenely wide enough.

clscott 2 days ago | parent | prev | next [-]

Those exact words are the positioning statement (the start of the second paragraph) of the document you linked.

What are you trying to say?

benterix 2 days ago | parent | next [-]

Their whole argument is based on this sentence. So I'd expect some rationale. Instead, they provide as "example" links to Google, Microsoft and Apple. The funny thing is that the one by MS is probably the most criticized one, with the company partly backpedaling on it. And Apple is often criticized by LLM aficionados for being quite conservative. Google is the one proposing it.

So my question is: are browsers and operating systems really expected to gain access to language models? If so - by whom: the users or LLM vendors like Google?

loloquwowndueo 2 days ago | parent | prev | next [-]

That “are expected” is a euphemism for “are shoehorning AI in and trying to shove it down users’ throats”. Whereas the truth is nobody (actual end users, that is) wants it.

I hate having to “dodge” all the AI-enabled controls my phone (iOS) is sprouting - I don’t need that shit, but there’s also no alternative.

walletdrainer 2 days ago | parent | prev [-]

> What are you trying to say?

GP is clearly asking “Are they?”

raincole 2 days ago | parent | prev [-]

Browsers: Chrome (proposed this Prompt API)

Operating Systems: Windows (built-in Copilot), MacOS, iOS (Apple Intelligence)

So it's >90% desktop browser and OS, plus >30% mobile OS.

Yes, I think it's very safe to say "browsers and operating systems are increasingly expected to gain access to language models."

kirb 2 days ago | parent | next [-]

These features are enabled by default and, in the case of iOS/macOS, desktop Chrome, and probably also Copilot+ PCs, download 4–7 GB local models without properly explaining this to users. This doesn’t confirm any demand, because if you just don’t use the features and your device doesn’t fill up, you may never notice.

I think this API is probably fine, but only if the user already has a model downloaded and wants these features. Naturally, case in point: Chrome quietly downloads Gemini Nano without any opt-out except through group policy. Things like this, and Microsoft’s recent admission that they’ve overindexed on Copilot features in Windows, make it increasingly difficult to trust that users actually want more than a few killer AI features, most of which are just ChatGPT.

Anecdotally, non-technical friends and family members know about ChatGPT and increasingly Gemini, get frustrated by Copilot, and don’t know Apple Intelligence exists.

https://superuser.com/questions/1930445/can-i-delete-the-chr...

benterix 2 days ago | parent | prev | next [-]

The word "expected" is a weasel word in this context, especially given how much backlash MS has received. I'd expect a link to a study where users say "I'd like to have an LLM integrated with my operating system and my browser," and how that changes over time. Then you could seriously argue for "increasingly expected".

deaux 2 days ago | parent | prev | next [-]

You omitted the clause "by shareholders" after "expected".

bigbadfeline a day ago | parent | prev | next [-]

> So it's >90% desktop browser and OS, plus >30% mobile OS.

> Yes, I think it's very safe to say "browsers and operating systems are increasingly expected to gain access to language models."

Doesn't follow. Every case you listed justifies LLM inclusion with a similar "everything is expected to be defiled by LLMs" argument; mine is a better wording, but it's still evasively passive, and the "expected" part is still nonsense.

Just don't tell me LLM inclusion is justified by "expected" all the way down, like the bottomless money pit it is.

bakugo 2 days ago | parent | prev [-]

What this proves is that browsers and operating systems are increasingly integrating language models, not that they are expected to do so.

The only people who expect them to do so are big tech executives. The average user does not expect nor want Copilot shoved into every possible corner of Windows, and Microsoft themselves have acknowledged this.