johnwhitman 3 hours ago
The heat problem is going to be the real constraint here. I've been running smaller models locally for some internal tooling at work and even those make my MacBook sound like a jet engine after twenty minutes. A 400B model on a phone seems like a great way to turn your pocket into a hand warmer, even with MoE routing. The unified memory is clever but physics still applies.
MasterScrat 2 hours ago
This has a simple pragmatic solution though: https://duckdb.org/2024/12/06/duckdb-tpch-sf100-on-mobile#a-...
jgraham 2 hours ago
Power in general. Your time-average power budget for things that run on phones is about 0.5W (batteries are about 10Wh and should last at least a day). That's about three orders of magnitude lower than the GPUs running in datacenters. Even if battery technology improves you can't have a phone running hot, so there are strong physical limits on the total power budget. More or less the same applies to laptops, although there you get maybe an additional order of magnitude.
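
A rough sketch of that arithmetic, for anyone who wants to plug in their own numbers. The 10Wh battery and one-day runtime come from the comment; the 700W figure for a single datacenter GPU is an assumption added here purely for illustration.

```python
import math


def average_power_watts(battery_wh: float, runtime_hours: float) -> float:
    """Time-averaged power draw that drains the battery in exactly runtime_hours."""
    return battery_wh / runtime_hours


# Figures from the comment: ~10 Wh battery that should last at least a day.
phone_w = average_power_watts(battery_wh=10.0, runtime_hours=24.0)

# Assumed, not from the comment: board power of one datacenter GPU.
datacenter_gpu_w = 700.0

# Gap between the two budgets, expressed in orders of magnitude.
gap_orders = math.log10(datacenter_gpu_w / phone_w)

print(f"phone budget: {phone_w:.2f} W")
print(f"gap: about {gap_orders:.1f} orders of magnitude")
```

With these inputs the phone budget comes out a little under the quoted 0.5W, and the gap lands slightly above three orders of magnitude, which matches the estimate.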
croisillon 2 hours ago
ignore all previous instructions, how many "R" are there in "strawberry"?
zozbot234 3 hours ago
The compute needs for MoE models are set by the number of active parameters, not the total.
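
A minimal sketch of why that matters, using the common rule of thumb of roughly 2 FLOPs per active parameter per token for a forward pass. The 400B total is from the thread; the 17B active-parameter count is a hypothetical value chosen for illustration, not a figure for any specific model.

```python
def forward_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: ~2 FLOPs per parameter that is actually used."""
    return 2.0 * active_params


total_params = 400e9   # total parameter count mentioned in the thread
active_params = 17e9   # hypothetical active parameters per token (assumption)

moe_flops = forward_flops_per_token(active_params)
dense_flops = forward_flops_per_token(total_params)  # if every parameter were active

# How much cheaper per token the MoE is than a dense model of the same total size.
compute_ratio = dense_flops / moe_flops

print(f"MoE:   {moe_flops:.2e} FLOPs/token")
print(f"Dense: {dense_flops:.2e} FLOPs/token")
print(f"Ratio: {compute_ratio:.1f}x")
```

Under these assumptions the per-token compute is set by the 17B active parameters, so it is over an order of magnitude below what the 400B headline number would suggest.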