vardump | 3 hours ago
So the 235B-parameter Qwen3-VL is FP16, which means it practically requires at least 512 GB of RAM to run? Possibly even more for a reasonable context window? Assuming I don't want to run it on a CPU, what are my options for running it at home under $10k? Or if my only option is CPU (vs. GPU or other specialized hardware), what would be the best way to spend that $10k? vLLM + multiple networked (10/25/100 Gbit) systems?
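For what it's worth, a rough back-of-the-envelope sketch of the weight memory (assuming 2 bytes per parameter for FP16, and ignoring KV cache, activations, and framework overhead, which is why a 512 GB box is a sensible floor):

```python
def fp16_weight_gib(params_billion: float) -> float:
    """Approximate memory for model weights alone at FP16.

    Assumes 2 bytes per parameter; KV cache and runtime
    overhead (which grow with context length) come on top.
    """
    return params_billion * 1e9 * 2 / (1024 ** 3)

# Weights alone for a 235B-parameter model:
print(f"{fp16_weight_gib(235):.0f} GiB")  # ~438 GiB before KV cache
```

So the weights alone land around 440 GiB, and once you add a reasonable context window's KV cache you're past what most 512 GB configurations leave free.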