adgjlsfhk1 a day ago

I've yet to see any convincing benchmarks showing that NPUs are more efficient than normal GPUs, at least none that account for the possibility of downclocking the GPU so it runs slower but more efficiently.

adastra22 a day ago

NPUs are more energy efficient. There is no doubt that a systolic array uses less energy per computation than a GPU running the same tensor operation, for the kinds of applications that are a natural fit.

Are they more performant? Hell no. But if you're going to do the calculation, and if you don't care about latency or throughput (e.g. batched processing of vector encodings), why not use the NPU?

Especially on mobile/edge consumer devices -- laptops or phones.

imtringued 17 hours ago

https://fastflowlm.com/benchmarks/

https://fastflowlm.com/assets/bench/gemma3-4b.png