floam 14 hours ago
It does. You can use it directly on the iOS 26 beta - without writing a line of code I can toy with the on-device model through Shortcuts on my 16 Pro. It's not meant to be a general-purpose chatbot… but it can work as one in airplane mode, which is a novel experience. https://share.icloud.com/photos/018AYAPEm06ALXciiJAsLGyuA https://share.icloud.com/photos/0f9IzuYQwmhLIcUIhIuDiudFw The above took about 3 seconds to generate. That little box that says On-device can be flipped between On-device, Private Cloud Compute, and ChatGPT. Their LLM runs on the ANE, sipping battery and leaving the GPU available.
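Beyond Shortcuts, the same on-device model is exposed to developers through Apple's Foundation Models framework. A minimal Swift sketch, using the API names Apple announced for iOS 26 (the prompt string and print output here are illustrative; exact signatures may shift during the beta):

```swift
import FoundationModels

// Sketch, not a definitive implementation: requires a device with
// Apple Intelligence enabled (e.g. iPhone 15 Pro or later on iOS 26).
Task {
    // Check whether the on-device model is available on this hardware.
    guard case .available = SystemLanguageModel.default.availability else {
        print("On-device model unavailable")
        return
    }

    // A session holds conversational context; respond(to:) performs
    // inference on-device, with no network round trip.
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Summarize Private Cloud Compute in one sentence."
    )
    print(response.content)
}
```

Note there is no flag in this API to pick Private Cloud Compute or ChatGPT; that routing picker belongs to the Shortcuts action, while the framework targets the on-device model.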
JKCalhoun 14 hours ago | parent | next
Wild to see what improvements might come with additional hardware support in future Apple Silicon chips.
ivape 13 hours ago | parent | prev | next
What's the cost of pointing it to Private Cloud Compute? It can't be free, can it?
bigyabai 14 hours ago | parent | prev
It would be interesting to see a tok/s comparison between the ANE and the GPU for inference. I bet these small models are a lot friendlier than the 7B/12B models that technically fit on a phone but won't accelerate well without a GPU.