scriptsmith 5 hours ago

If Chrome has the #optimization-guide-on-device-model and #prompt-api-for-gemini-nano flags enabled (because it's part of an Origin Trial, an Early Stable release, or similar), then web pages have access to the new Prompt API, which lets any webpage initiate the (one-time) download of the ~2.7 GiB CPU or ~4.0 GiB GPU model via LanguageModel.create()

https://developer.chrome.com/docs/ai/prompt-api

When Chrome 148 releases tomorrow, this will be the default behaviour on desktop.

Before downloading, it should check for 22 GiB of free disk space on the volume holding your Chrome data dir, and at least twice the model size free in your tmp dir.
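A minimal sketch of the flow described in the linked docs, assuming the current shape of the API (`LanguageModel.availability()` / `LanguageModel.create()`); the global is only defined in supporting Chrome builds, so this is guarded for other environments:

```javascript
// Sketch: feature-detect the Prompt API, then create a session.
// create() is what triggers the one-time multi-GiB model download.
async function promptOnDevice(text) {
  if (typeof LanguageModel === "undefined") {
    return null; // Prompt API not exposed in this environment
  }
  const status = await LanguageModel.availability();
  if (status === "unavailable") return null;

  const session = await LanguageModel.create({
    monitor(m) {
      // Fires while the model is being downloaded (loaded is 0..1)
      m.addEventListener("downloadprogress", (e) => {
        console.log(`model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
  return session.prompt(text);
}
```

Note that `availability()` can also report "downloadable" or "downloading", which is how a page distinguishes "flag on but model not yet fetched" from "ready".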

wuschel 4 hours ago | parent | next [-]

It is a small model, so what utility can I / Google expect from it? What is the on-board model used for?

2ndorderthought 3 hours ago | parent | next [-]

It's not a very good small model to be honest.

That said, you might be surprised to learn that some of the 3B-9B models could probably replace 80% of the things non-vibe coders use ChatGPT for.

It's a good idea to run small models locally, for privacy and cost reasons, if your computer can host them. But how can you trust Google to auto-install one on your machine in 2026? I just couldn't do it.

imglorp 2 hours ago | parent | next [-]

Sure, local models are good, and yes, there's no way we can trust Google.

We can be positive the entire motivation of Chrome is user behavior surveillance. There's not a nano-chance in all the multiverses that Chrome's model is doing anything privately. They've gone to extraordinary lengths to accomplish this. It's not for free.

reactordev 21 minutes ago | parent | next [-]

It is entirely about user surveillance, as well as pushing their product onto their users because they have the install base. Google Chrome has become Microsoft's IE6 in hostile user behavior.

aftbit 7 minutes ago | parent [-]

You either die a hero or live long enough to see yourself become a villain.

What did we expect when they dropped "don't be evil" from their company values?

akoboldfrying 2 hours ago | parent | prev [-]

I don't trust them either, but the same Google makes Gemma 4 available to run as locally and privately as you want, and those models are pretty amazing for their size.

tsss an hour ago | parent | prev | next [-]

Half of the reason to use local AI is to circumvent the censorship that Google, OpenAI and so on have. I don't want this Google crap on my computer.

soco 2 hours ago | parent | prev [-]

Which is why I uninstalled Chrome a (short...) while ago and my life went on unbothered.

michaelbuckbee 2 hours ago | parent | prev | next [-]

I ran a fairly large production test of this and on _every_ measure except for privacy it was worse than a free tier server hosted LLM.

Not happy about that as I would like to see more local models but that's the current state of things.

https://sendcheckit.com/blog/ai-powered-subject-line-alterna...

gchamonlive 37 minutes ago | parent [-]

> on _every_ measure except for privacy it was worse than a free tier server hosted LLM

Would you be able to compare this to other local models in its class and above that would fit consumer-grade hardware?

scriptsmith 4 hours ago | parent | prev [-]

It's based on Gemma 3n, and it's not the best.

I find it works fine for simple classification, translation, interpretation of images & audio. It can write longer prose, but it's pretty bad.

It can also produce output constrained to a JSON schema or a regexp, for anything you might want to do with structured data.
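The structured-output part looks roughly like this, assuming the documented `responseConstraint` option (schema and prompt here are illustrative); guarded so it no-ops outside Chrome:

```javascript
// Sketch: constrain the model's output to a JSON schema, then parse it.
const sentimentSchema = {
  type: "object",
  properties: {
    sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
  },
  required: ["sentiment"],
};

async function classifySentiment(text) {
  if (typeof LanguageModel === "undefined") return null;
  const session = await LanguageModel.create();
  const raw = await session.prompt(
    `Classify the sentiment of: ${text}`,
    { responseConstraint: sentimentSchema },
  );
  return JSON.parse(raw); // e.g. { sentiment: "positive" }
}
```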

Wowfunhappy 2 hours ago | parent [-]

I wonder why they’re using Gemma 3 and not Gemma 4?

scriptsmith 2 hours ago | parent | next [-]

Google has been trialling the Prompt API in Chrome for over a year, so from before Gemma 4 existed. But they are indicating they'll move to Gemma 4: https://groups.google.com/a/chromium.org/g/blink-dev/c/iR6R7...

dotancohen 2 hours ago | parent | prev | next [-]

So that the big news on non-tech news sites will be the update, ensuring this is received in a positive light.

andy_ppp 2 hours ago | parent | prev [-]

It'll probably update to that without telling you at some point.

tobylane 5 hours ago | parent | prev [-]

Those two (and more) exist in chrome://flags in Chrome 147. I'm disabling them now, with the expectation that will prevent the new default.

One option I'm leaving as default is "Use LiteRT-LM runtime for on-device model service inference." Any comment on that?

RaiausderDose 2 hours ago | parent | next [-]

I'm on Chrome 147 too and disabled:

"optimization-guide-on-device-model"

- Enables optimization guide on device

"prompt-api-for-gemini-nano"

- Prompt API for Gemini Nano

- Prompt API for Gemini Nano with Multimodal Input

and deleted weights.bin and the 2025.x folder in "OptGuideOnDeviceModel"

Will report if Chrome 148 downloads the model again.

phs318u 2 hours ago | parent | next [-]

If you touch those files into existence, chown them to root, and chmod them to 0, it shouldn't ever be able to overwrite them, right?

pmontra 2 hours ago | parent | next [-]

I'm on my phone now so I can't check if something has changed, but what you want to protect from change is the directory, not the files. A file can be deleted and recreated if the process can write to the directory.

RaiausderDose 2 hours ago | parent | prev [-]

yeah, should work. Will try read-only on Windows too.

Now I can't see it anymore, but shouldn't the model be under chrome://on-device-internals/ -> model-status?

Maybe you can uninstall there too.

Markoff an hour ago | parent | prev [-]

thanks, went to flags in Vivaldi and, just in case, disabled all flags containing "gemini" and the first five results for "model"

scriptsmith 5 hours ago | parent | prev [-]

Those flags will exist already, but will default to enabled in 148.

That other flag switches to a different, open-source inference engine instead of the (from what I can tell) closed-source one used by default.