| ▲ | subhobroto 9 hours ago |
> It works, I've shipped this as a "local inference"/poor person's ollama for low-end LLM tasks like search

Fantastic!

> the model download is orders of magnitude greater than downloading the browser itself, and something that needs to happen before you get your first token back

Sure, but does this mean the model is lazily downloaded? That is, if I used this and mine was the first call to the model, would the user be stuck waiting until the download finished? That sounds like a horrible user experience - maybe Chrome reduces the confusion by showing a download status dialog or similar?

Also, any idea what the on-disk impact is?
|
| ▲ | avaer 8 hours ago | parent | next [-] |
| The model download is lazy and cached, so it's a one-time cost presumably across all origins (I assume so since the alternative would be a trivial DoS waiting to happen). So it's once per browser, not once per site. You can track the download state yourself and pop whatever UI you want. |
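A sketch of what "track the download state yourself" can look like, assuming the shape of Chrome's Prompt API (`LanguageModel.availability()` and the `monitor`/`downloadprogress` hooks on `LanguageModel.create()`); the `progressLabel` helper is my own, not part of any API:

```javascript
// Render a progress label from the Prompt API's downloadprogress
// events, where e.loaded is a fraction between 0 and 1.
function progressLabel(loaded) {
  return `Downloading model… ${Math.round(loaded * 100)}%`;
}

async function getSession(onProgress) {
  // availability() resolves to one of:
  // "unavailable" | "downloadable" | "downloading" | "available"
  const status = await LanguageModel.availability();
  if (status === "unavailable") {
    throw new Error("Built-in model not supported on this device");
  }

  // create() triggers the one-time download if needed; the monitor
  // callback lets the page show its own UI instead of leaving the
  // user staring at nothing until the first token arrives.
  return LanguageModel.create({
    monitor(m) {
      m.addEventListener("downloadprogress", (e) => {
        onProgress(progressLabel(e.loaded));
      });
    },
  });
}
```

On a machine where the model is already cached, `create()` resolves without firing any progress events, which matches the once-per-browser behavior described above.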
|
| ▲ | tastroder 7 hours ago | parent | prev | next [-] |
| chrome://on-device-internals reports "Model Name: v3Nano Version: 2025.06.30.1229 Folder size: 4,072.13 MiB" on a random Windows machine I just checked. |
| |
▲ | subhobroto 6 hours ago | parent [-] | | Thank you, stranger! I would have assumed the size would vary based on whether your hardware supports the high-quality GPU backend (4 GB) or falls back to a smaller CPU-compatible version (3 GB), but the 22 GB note on that page is really confusing. Even if it included the model server, where is the remaining 18 GB going?
▲ | danpalmer 5 hours ago | parent [-] | | I'd imagine the 22 GB was decided by modelling various scenarios. For a start, it's not just one 4 GB current model: it's 2x4 GB so the model can be updated without a window where the computer has no model at all, which is up to 8 GB. Then it's possible the model you get scales with the CPU/GPU/RAM available, so if you have a 12 GB GPU you probably get a better model, perhaps a 10-11 GB one? At 2x, that's 22 GB. Then consider that a machine is not static: GPUs and other hardware come and go, VRAM allocation in integrated graphics changes, etc. You end up just needing to pick one number and not confuse users.
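The doubling reasoning above amounts to a one-line budget calculation; the sizes here are the commenter's guesses, not documented figures:

```javascript
// Disk budget for keeping the current model plus a staged update
// side by side: two full copies of the largest model you might get.
function diskBudgetGiB(largestModelGiB) {
  return 2 * largestModelGiB;
}

// A 4 GiB model needs 8 GiB of headroom; a hypothetical 11 GiB
// model on beefier hardware needs 22 GiB, matching the stated
// requirement.
```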
| ▲ | domenicd 3 hours ago | parent [-] | | (Former Chrome built-in AI team member here.) This is part of it, and also we just didn't want to use up the last of the user's disk space! It's disrespectful to use up 3 GB if the user only has 4 GB left; it's sketchy if the user only has 10 GB. At 22 GB, we felt there was more room to breathe. One could argue that users should have more agency and transparency into these decisions, and for power users I agree... some kind of neato model management UI in chrome://settings would have been cool. But 99% of users would never see that, so I don't think it ever got built. |
|
|
|
|
| ▲ | why_is_it_good 8 hours ago | parent | prev | next [-] |
| > Storage: At least 22 GB of free space on the volume that contains your Chrome profile. |
| |
| ▲ | dotancohen 7 hours ago | parent | next [-] | | Yes, but that is then followed by: > Built-in models should be significantly smaller. The exact size may vary slightly with updates.
| |
▲ | taejavu 8 hours ago | parent | prev | next [-] | | Lmao, and here I am still staunchly treating Blazor’s 2 MB runtime as a deal-breaker
▲ | subhobroto 8 hours ago | parent | prev [-] | | > Storage: At least 22 GB of free space on the volume that contains your Chrome profile. Yes, I can read and comprehend English, and you should assume I read the page. Because of the "At least" wording, I was curious what a person who has actually used the feature has noticed - i.e., learning from people who have already done it.
|
|
| ▲ | jfoster 6 hours ago | parent | prev [-] |
Doesn't sound great, but consider how much better this is than every webpage trying to load its own models. If it turns out useful enough, I'm sure browsers will start including it as a (perhaps optional) part of installation.
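A sketch of the fallback pattern that makes the shared download pay off: feature-detect the built-in model and only reach for a page-hosted alternative when it's absent. The `LanguageModel` global matches Chrome's Prompt API; the fallback URL is hypothetical:

```javascript
// Prefer the browser's shared built-in model when the Prompt API is
// exposed; otherwise fall back to a remote endpoint, so the page
// never has to ship multi-gigabyte weights itself.
function pickBackend(globalObj, fallbackUrl) {
  if ("LanguageModel" in globalObj) {
    return { kind: "built-in" };
  }
  return { kind: "remote", url: fallbackUrl };
}

// In a page: pickBackend(globalThis, "https://example.com/llm");
```

Because the model is cached once per browser rather than once per origin, every site taking the "built-in" branch shares the same on-disk copy.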