thih9 2 days ago
> it also would use less electricity

How would it use less electricity? I’d like to learn more.
jychang 2 days ago | parent
That's completely not true. An LLM on device would use MORE electricity. Service providers running batch>1 inference are much more efficient per watt: decoding is memory-bandwidth bound, so the cost of streaming the model weights each step gets amortized across every request in the batch. Local inference can only do batch=1, so one user pays that whole cost alone, which is very inefficient.
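The amortization argument can be sketched with a toy cost model. All the numbers below (model size, bandwidth, power draw, per-token compute time) are illustrative assumptions, not measurements of any real hardware:

```python
# Toy model of one LLM decode step on a memory-bandwidth-bound accelerator.
# Every step must stream all weights from memory once, regardless of batch
# size, so that cost is shared by every sequence in the batch.
# All constants are hypothetical, for illustration only.

def joules_per_token(batch_size,
                     weight_bytes=14e9,       # assumed ~7B params at fp16
                     bandwidth_bps=1e12,      # assumed memory bandwidth, bytes/s
                     board_watts=400,         # assumed accelerator power draw
                     compute_s_per_token=1e-4):
    """Estimated energy per generated token for one decode step."""
    stream_s = weight_bytes / bandwidth_bps            # time to read weights once
    step_s = stream_s + compute_s_per_token * batch_size
    step_joules = board_watts * step_s
    return step_joules / batch_size                    # amortized per token

for b in (1, 8, 32):
    print(f"batch={b}: {joules_per_token(b):.3f} J/token")
```

Under these assumptions, energy per token falls steeply as batch size grows, which is the per-watt advantage the comment describes for service providers versus batch=1 local inference.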