HanClinto 5 hours ago

Yeah, but it can be a tight squeeze if you don't have at least 24 GB (preferably 32 GB+) of memory.

Especially if you want other apps to run at the same time, I think it's safer to stick with something more like 9b. You can see a table with quantized sizes here [0]. Yes, there are smaller quants than Q4_K_XL, but at that point you're down in the weeds nickel-and-diming memory, and if you want to keep even one memory-hungry app like VSCode running, good luck.

IMO -- if 9b is doing the job, stick with 9b.
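
For a rough sense of the arithmetic, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, the ~4.5 bits per weight is only a ballpark for a 4-bit-class quant (real files vary by quant type and model), and the overhead figure for context/KV cache is a placeholder. The table in [0] is the better source for actual sizes.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate quantized weight size in GB (1 GB = 10^9 bytes).

    params_billions * 1e9 weights * bits_per_weight / 8 bytes, divided by 1e9.
    """
    return params_billions * bits_per_weight / 8


def fits_in_memory(
    params_billions: float,
    bits_per_weight: float,
    total_gb: float,
    reserved_gb: float,
    overhead_gb: float = 1.5,  # assumed placeholder for KV cache / runtime buffers
) -> bool:
    """True if weights plus overhead fit in what is left after other apps.

    reserved_gb is memory you want to keep free for the OS, editor, browser, etc.
    """
    needed_gb = estimate_weights_gb(params_billions, bits_per_weight) + overhead_gb
    return needed_gb <= total_gb - reserved_gb


if __name__ == "__main__":
    # Hypothetical numbers: a 9b model at an assumed ~4.5 bits per weight.
    print(f"9b weights: ~{estimate_weights_gb(9, 4.5):.1f} GB")
    print("fits on 16 GB with 8 GB reserved:", fits_in_memory(9, 4.5, 16, 8))
```

Plug in the parameter count of whatever larger model you're eyeing and how much memory you want left for everything else; if the answer is only barely "yes", that's the tight squeeze.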

0 - https://github.com/ggml-org/LlamaBarn/pull/63