| ▲ | entrope 8 hours ago | |||||||
HuggingFace says this model has 753B parameters, which will need a lot more RAM than a maxed-out MacBook Pro. With 40B active parameters, running from SSD would need patience. | ||||||||
| ▲ | _aavaa_ 5 hours ago | parent | next [-] | |||||||
For an fp4 quantization it should fit with room to spare for KVCache | ||||||||
| ||||||||
| ▲ | api 8 hours ago | parent | prev [-] | |||||||
I’ve wondered for a while if anyone is working on very wide channel parallel (kind of like RAID 0) SSD for this purpose. Couple that with a tensor processor and that would be interesting. | ||||||||
| ||||||||