	▲	Havoc 6 hours ago
		I would think a quantized 27b should be doable in mac world too?
	▲	aegis_camera 6 hours ago
		My preference is LFM 450M for vision tasks and Qwen 9B Q4 for orchestration.
	▲	HanClinto 5 hours ago
		Yeah, but it can be a tight squeeze if you don't have at least 24 GB (preferably 32 GB+) of memory. Especially if you want other apps to run at the same time, I think it's safer to stick with something more like 9b. You can see a table of quantized sizes here [0]. Yes, there are smaller quants than Q4_K_XL, but then you're down in the weeds nickel-and-diming memory, and if you want to keep even a (memory-hungry) instance of VSCode running, good luck. IMO, if 9b is doing the job, stick with 9b.
		[0] https://github.com/ggml-org/LlamaBarn/pull/63
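		For a rough feel of the arithmetic behind that, here is a back-of-the-envelope sketch. Everything numeric in it is an assumption for illustration, not a measurement: about 4.5 bits per weight for a Q4-class quant, a flat 2 GB of runtime overhead (KV cache, buffers), and 8 GB reserved for the OS and other apps. The function names are made up for this example; check the table linked in [0] for actual file sizes.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    # params_billions * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billions * bits_per_weight / 8


def fits(
    params_billions: float,
    bits_per_weight: float,
    total_ram_gb: float,
    reserved_gb: float,
    runtime_overhead_gb: float = 2.0,  # assumed allowance for KV cache and buffers
) -> bool:
    """Return True if weights plus overhead fit in the RAM left after the reservation."""
    needed = estimate_weights_gb(params_billions, bits_per_weight) + runtime_overhead_gb
    return needed <= total_ram_gb - reserved_gb


if __name__ == "__main__":
    BITS = 4.5       # assumed average bits per weight for a Q4-class quant
    RESERVED = 8.0   # assumed GB kept free for the OS, editor, browser, etc.
    for ram in (16, 24, 32):
        for size in (9, 27):
            verdict = "fits" if fits(size, BITS, ram, RESERVED) else "too tight"
            print(f"{size}b on {ram} GB: {verdict}")
```

		Tweak the reserved and overhead numbers to match your own setup. Long contexts in particular can push the overhead well past the flat figure assumed here.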