Remix clone Hacker News

	▲	nnx 4 hours ago
		> My M5 Pro can generate 130 tok/s (4 streams) on Gemma 4 26B.

		This seems high. At which quantization? Using LM Studio or something else? Note: Darkbloom seems to run everything on Q8 MLX.