Remix clone Hacker News

	▲	eek2121 3 hours ago
		Not really, Qwen 27b offloads to a decent gaming GPU (RTX 4090 in my case) without needing tons of RAM.
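A rough back-of-envelope sketch of why this can work. It assumes the weights are quantized to roughly 4 bits, which the comment does not state, and it counts weights only (no KV cache or runtime overhead). The 24 GB figure is the RTX 4090's VRAM; the helper name `estimate_weight_gib` is made up for illustration.

```python
def estimate_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 24  # RTX 4090; treated as a hard budget here for simplicity

if __name__ == "__main__":
    # Bit widths are illustrative; real quantization formats often use
    # slightly more bits per weight than their nominal value.
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        size = estimate_weight_gib(27, bits)
        verdict = "fits" if size < VRAM_GIB else "does not fit"
        print(f"{label}: ~{size:.1f} GiB of weights, {verdict} in {VRAM_GIB} GiB")
```

Under these assumptions only the 4-bit case leaves headroom for context on a single 24 GB card; longer contexts eat into that margin.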
	▲	mathisfun123 2 hours ago
		can you give more info? llama.cpp vs vllm? config? i wanna try specifically this model