mirekrusin 13 hours ago:
A local RTX 5090 is actually faster than an A100/H100.
aurareturn 12 hours ago:
It's a $4,000 GPU with 32GB of VRAM that needs a 1,000-watt PSU. It's not realistic for the masses, and a variant with something like 80GB of VRAM would cost $10k.

The real local-LLM chip is Apple Silicon, starting with the M5 generation and its matmul acceleration in the GPU. You can run a good model on an M5 Max with 128GB of unified memory, with good prompt-processing and token-generation speeds. Good enough for many things. Apple accidentally stumbled into a huge advantage in local LLMs through its unified memory architecture.

Still, it's not for the masses, not cheap, and not great. It's going to take years to slowly enable local LLMs on ordinary mass-market computers.
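
To make the capacity argument concrete, here's a back-of-envelope sketch (my own illustrative numbers, not the commenter's) of why 128GB of unified memory beats 32GB of VRAM for local inference, assuming 4-bit quantized weights plus ~20% headroom for KV cache and activations:

    # Rough footprint of a quantized LLM; sizes and quantization
    # levels are illustrative assumptions, not measurements.
    def model_footprint_gb(params_billions: float, bits_per_weight: float,
                           overhead: float = 1.2) -> float:
        """Weights in GB (1B params * 1 byte = 1 GB), with ~20% headroom."""
        return params_billions * (bits_per_weight / 8) * overhead

    for params in (8, 32, 70, 120):
        need = model_footprint_gb(params, bits_per_weight=4)
        fits_5090 = need <= 32  # RTX 5090 VRAM
        fits_mac = need <= 96   # assumed usable share of a 128GB unified-memory Mac
        print(f"{params:>4}B @ 4-bit ~ {need:5.1f} GB  "
              f"fits 32GB GPU: {fits_5090}  fits 128GB Mac: {fits_mac}")

Run it and the crossover is obvious: 8B and 32B models fit either machine, but a 70B model at 4-bit needs roughly 42GB, which a 32GB card can't hold without offloading, while the Mac takes it comfortably.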