floam 14 hours ago
It does. You can use it directly on the iOS 26 beta - without writing a line of code I can toy with the on-device model through Shortcuts on my 16 Pro. It's not meant to be a general-purpose chatbot… but it can work as one in airplane mode, which is a novel experience. https://share.icloud.com/photos/018AYAPEm06ALXciiJAsLGyuA https://share.icloud.com/photos/0f9IzuYQwmhLIcUIhIuDiudFw The above took about 3 seconds to generate. That little box that says On-device can be flipped between On-device, Private Cloud Compute, and ChatGPT. Their LLM runs on the ANE, sipping battery and leaving the GPU available.
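Beyond Shortcuts, the same on-device model is exposed to developers through Apple's Foundation Models framework. A minimal Swift sketch, using the API names Apple announced for iOS 26 (the prompt string and print output here are illustrative; exact signatures may shift during the beta):

```swift
import FoundationModels

// Sketch, not a definitive implementation: requires a device with
// Apple Intelligence enabled (e.g. iPhone 15 Pro or later on iOS 26).
Task {
    // Check whether the on-device model is available on this hardware.
    guard case .available = SystemLanguageModel.default.availability else {
        print("On-device model unavailable")
        return
    }

    // A session holds conversational context; respond(to:) performs
    // inference on-device, with no network round trip.
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Summarize Private Cloud Compute in one sentence."
    )
    print(response.content)
}
```

Note there is no flag in this API to pick Private Cloud Compute or ChatGPT; that routing picker belongs to the Shortcuts action, while the framework targets the on-device model.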
JKCalhoun 14 hours ago | parent | next
Wild to see what improvements might come with additional hardware support in future Apple Silicon chips.
ivape 13 hours ago | parent | prev | next
What's the cost of pointing it to Private Cloud Compute? It can't be free, can it?
bigyabai 14 hours ago | parent | prev
It would be interesting to see a tok/s comparison between the ANE and the GPU for inference. I bet these small models are a lot friendlier than the 7B/12B models that technically fit on a phone but won't accelerate well without a GPU.