jcgrillo | 7 hours ago
I wonder what the imagined use case was? TBH I was seriously considering buying a Framework Desktop, but the NPU put me off. I don't get why I should have to pay for a bunch of silicon that doesn't do anything. And now that there's some software support... it still doesn't do anything? Why does it even exist at all, then?
ThatPlayer | an hour ago | parent | next
At least part of it is probably Microsoft's 40 TOPS NPU requirement for the Copilot+ badge. Intel also has NPUs in its modern CPUs, and phone SoC makers have been doing this even longer, though Google calls theirs a TPU. I run an older Google Coral TPU in my home lab, where Frigate NVR uses it for object detection on security cameras. It's more efficient, but less flexible, than running the same workload on the GPU. I don't know if I need an NPU in my daily-driver computer, but I'd want one in my next home server.
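For reference, pointing Frigate at a Coral is a few lines in its config file. This fragment follows Frigate's documented detector syntax and assumes a USB Coral (PCIe variants use a different `device` string):

```yaml
# Frigate detector config: use a USB Google Coral (Edge TPU)
# for object detection instead of CPU/GPU inference.
detectors:
  coral:
    type: edgetpu
    device: usb
```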
cpburns2009 | 5 hours ago | parent | prev | next
The NPU is entirely useless on the Framework Desktop, and really on all Strix Halo devices. Where it could be useful is cell phones, for the use cases @naasking mentions (audio-to-text and text-to-audio processing), and maybe IoT.
naasking | 5 hours ago | parent | prev
Small models aren't entirely useless, and from what I've seen the NPU can run LLMs up to around 8B parameters. So here's one way they could be useful: Qwen3's text-to-speech models are all under 2B parameters, and OpenAI's whisper-small speech-to-text model is under 1B parameters. You could build an AI agent you can talk to and that talks back, where, in theory, all audio-to-text and text-to-audio processing is offloaded to the low-power NPU, leaving the GPU free to do all of the LLM processing.
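A minimal sketch of the split described above: speech models small enough for the NPU get pinned there, and the main chat model stays on the GPU. The device names, task kinds, and the 8B-parameter ceiling are illustrative assumptions, not a real runtime API:

```python
from dataclasses import dataclass

@dataclass
class Task:
    kind: str        # "stt", "tts", or "llm" (hypothetical task labels)
    params_b: float  # model size in billions of parameters

# Rough ceiling from the comment above: the NPU handles models up to ~8B params.
NPU_PARAM_LIMIT_B = 8.0

def place(task: Task) -> str:
    """Route small speech models to the NPU; everything else runs on the GPU."""
    if task.kind in ("stt", "tts") and task.params_b <= NPU_PARAM_LIMIT_B:
        return "npu"
    return "gpu"

# whisper-small (~0.24B) and a small TTS model fit the NPU budget;
# the chat LLM itself stays on the GPU.
print(place(Task("stt", 0.24)))  # npu
print(place(Task("tts", 1.7)))   # npu
print(place(Task("llm", 30.0)))  # gpu
```

The point of the routing is that the speech pipeline's constant, low-intensity work never competes with the LLM for GPU time.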