seemaze 3 days ago

I think you are correct. I’ve mostly been working with plain llama.cpp, but recently started looking into lemonade for the baked-in NPU support.

data-ottawa 20 hours ago

The NPU is why I started using it. It's cool, but I haven't found a real use case.

My FW Desktop draws 27W under NPU load vs 100W under full GPU load. But the per-watt efficiency seems similar and the GPU is much faster, so the benefit isn't clear.

The NPU can run while gaming though, so that's useful.
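
To put rough numbers on the power comparison above, here is a small sketch of the energy-per-token arithmetic. Only the 27W and 100W figures come from my machine; the throughput value is a made-up placeholder, not a measurement, and `joules_per_token` is just a helper name I picked for illustration.

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy spent per generated token (watts / tokens-per-second). Lower is better."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


NPU_WATTS = 27.0   # power draw quoted in the comment
GPU_WATTS = 100.0  # power draw quoted in the comment

# Speedup the GPU needs over the NPU to spend the same energy per token.
break_even_speedup = GPU_WATTS / NPU_WATTS

# Hypothetical NPU throughput, NOT a benchmark result.
npu_tps = 10.0
# GPU throughput chosen to sit exactly at the break-even point.
gpu_tps = npu_tps * break_even_speedup

npu_cost = joules_per_token(NPU_WATTS, npu_tps)
gpu_cost = joules_per_token(GPU_WATTS, gpu_tps)

print(f"break-even GPU speedup: {break_even_speedup:.2f}x")
print(f"NPU: {npu_cost:.2f} J/token, GPU: {gpu_cost:.2f} J/token")
```

So "similar per-watt efficiency" would mean the GPU is somewhere around 3.7x faster. If it is faster than that, the GPU wins on energy per token too, and the NPU's main advantage is leaving the GPU free.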