Remix clone Hacker News


	▲	genpfault a day ago
		llama.cpp (b8642) auto-fits ~200k context on this 24GB RX 7900 XTX & it shows a solid 100+ tok/s ("S_TG t/s") on the first 32k of it, nice!

		./llama-batched-bench -hf unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_XL \
		  -npp 1000,2000,4000,8000,16000,32000,64000,96000,128000 -ntg 128 -npl 1 -c 0

		|     PP |  TG | B |   N_KV |  T_PP s | S_PP t/s | T_TG s | S_TG t/s |     T s |   S t/s |
		|--------|-----|---|--------|---------|----------|--------|----------|---------|---------|
		|   1000 | 128 | 1 |   1128 |   0.416 |  2404.87 |  1.064 |   120.29 |   1.480 |  762.20 |
		|   2000 | 128 | 1 |   2128 |   0.755 |  2649.86 |  1.075 |   119.04 |   1.830 | 1162.83 |
		|   4000 | 128 | 1 |   4128 |   1.501 |  2665.72 |  1.093 |   117.08 |   2.594 | 1591.49 |
		|   8000 | 128 | 1 |   8128 |   3.142 |  2545.85 |  1.114 |   114.87 |   4.257 | 1909.47 |
		|  16000 | 128 | 1 |  16128 |   6.908 |  2316.00 |  1.189 |   107.65 |   8.097 | 1991.73 |
		|  32000 | 128 | 1 |  32128 |  16.382 |  1953.31 |  1.278 |   100.12 |  17.661 | 1819.16 |
		|  64000 | 128 | 1 |  64128 |  43.427 |  1473.74 |  1.453 |    88.12 |  44.879 | 1428.89 |
		|  96000 | 128 | 1 |  96128 |  82.227 |  1167.50 |  1.623 |    78.86 |  83.850 | 1146.42 |
		| 128000 | 128 | 1 | 128128 | 133.237 |   960.69 |  1.797 |    71.25 | 135.034 |  948.86 |
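
For anyone reading the benchmark numbers, here is a rough sketch of how the speed columns appear to relate to the timing columns. It assumes each speed is simply tokens divided by seconds (S_PP = PP / T_PP, S_TG = TG / T_TG, S = N_KV / T with N_KV = PP + TG and T = T_PP + T_TG). That assumption matches the posted figures to within rounding of the printed times, but it is inferred from the numbers rather than taken from the tool's source. The `throughput` helper is hypothetical and only for illustration.

```python
# Three rows copied from the benchmark output above: (PP, TG, T_PP s, T_TG s).
rows = [
    (1000, 128, 0.416, 1.064),
    (32000, 128, 16.382, 1.278),
    (128000, 128, 133.237, 1.797),
]


def throughput(pp, tg, t_pp, t_tg):
    """Return (S_PP, S_TG, S) in tokens/s, assuming speed = tokens / seconds."""
    s_pp = pp / t_pp                      # prompt-processing speed
    s_tg = tg / t_tg                      # token-generation speed
    s_total = (pp + tg) / (t_pp + t_tg)   # combined: N_KV tokens over total time
    return s_pp, s_tg, s_total


for pp, tg, t_pp, t_tg in rows:
    s_pp, s_tg, s_total = throughput(pp, tg, t_pp, t_tg)
    print(f"PP={pp:>6}  S_PP={s_pp:8.1f}  S_TG={s_tg:6.1f}  S={s_total:8.1f}")
```

Because the times are printed to three decimals, the recomputed values can differ slightly from the table in the last digits.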
	▲	danielhanchen a day ago
		Oh nice that's pretty good!
	▲	spwa4 13 hours ago
		~50 tok/s on M1 Max 64GB