ThatPlayer 4 days ago
Is that using the NPU on that board? I know it's possible to use those too.
rao-v 3 days ago
It is possible (superb subreddit), but converting a modern model is painful and new models take ages to be supported. The NPU is energy efficient but no faster than the CPU for generation (and has lousy software support). I’m mostly interested in using the NPU to run a vision head in parallel with an LLM to speed up time to first token with VLLMs (I kinda want to turn them into privacy-safe vision devices for consumer use cases).