sergeivaskov 3 hours ago
The premise that 'barely any decent size models can run on it' misses the biggest advantage of Apple Silicon: unified memory. Because the CPU and GPU share a single memory pool, most of a machine's 64GB or 128GB of RAM is directly addressable by the GPU. Where else can you get that much effective VRAM for running quantized models at this price point? Buying the equivalent in Nvidia GPUs (multiple RTX 3090s/4090s) would cost thousands of dollars, draw massive power, and sound like a jet engine. The Mac Mini is dead silent, sips power, and lets you run 70B+ parameter models locally via llama.cpp. It's currently the undisputed king of VRAM-per-dollar for local inference.
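
To put rough numbers on it: a 70B model at Q4_K_M quantization averages about 4.8 bits per weight, so the weights alone come to roughly 70e9 * 4.8 / 8 ≈ 42 GB, which fits in 64GB of unified memory with room left for the KV cache. A minimal llama.cpp invocation looks something like this (the model filename is just an example, and this assumes a Metal-enabled build of llama.cpp with a GGUF quant already downloaded):

    # -ngl 99 offloads all layers to the Apple GPU via Metal; -c sets the context window
    ./llama-cli -m models/llama-3-70b-instruct.Q4_K_M.gguf \
        -ngl 99 -c 8192 \
        -p "Explain unified memory in one paragraph."
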