giancarlostoro 3 hours ago

Depends on which variant you pull down, but a single 5090 GPU (I know these are insanely expensive, but for context) could run either the Q8 or the Q4_K_M version. It will not fit the 52GB BF16 version, on the other hand. For that, any modern Mac with a Pro or better processor and more than 52GB of RAM would suffice (don't forget that the VRAM needed for the context window also matters!). As someone else noted, a 128GB model would probably do the trick and give you enough wiggle room to max out the context window.
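As a rough sanity check on those sizes, here is a minimal sketch that scales the BF16 size by bits per weight. The helper names `estimate_gb` and `fits` are hypothetical, the Q4_K_M figure (about 4.85 bits per weight) is an assumed average, and real files will differ somewhat, so treat the output as a ballpark only.

```shell
#!/bin/sh
# Rough size estimate for a quantized model, scaled from its BF16 size.
# Bits-per-weight values are approximations, not exact file sizes:
#   BF16 = 16, Q8_0 = 8.5, Q4_K_M ~ 4.85 (assumed average)
# estimate_gb is a hypothetical helper name, not part of any tool.
estimate_gb() {
    bf16_gb=$1   # size of the BF16 weights in GB
    bpw=$2       # approximate bits per weight of the target quant
    awk -v s="$bf16_gb" -v b="$bpw" 'BEGIN { printf "%.1f\n", s * b / 16 }'
}

# fits (also hypothetical): succeeds when the weights plus an allowance
# for the context window fit in the given amount of VRAM (all in GB).
fits() {
    awk -v w="$1" -v c="$2" -v v="$3" 'BEGIN { exit !(w + c <= v) }'
}

# Example (the 2 GB context allowance is an assumption):
#   fits "$(estimate_gb 52 8.5)" 2 32 && echo "Q8 should fit in 32 GB"
```

With the numbers above, `estimate_gb 52 8.5` comes out around 27.6, which is consistent with Q8 fitting on a 32GB card while the 52GB BF16 version does not.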

My Mac only has 16GB of VRAM (20GB total - 8 is reserved for the OS), so I have to leave some headroom in VRAM: I usually find a model that fits in 5 to 7 GB of VRAM and then max out the context window as much as I can.

pixelesque an hour ago | parent [-]

Note you can change the amount of shared (V)RAM reserved for the OS with:

sudo sysctl iogpu.wired_limit_mb=18800

This will allow you to use more, but you do need to leave a bit for the OS, obviously!
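If you'd rather derive the number than guess it, a small sketch like this works. `wired_limit_mb` is a hypothetical helper name, and how much to hold back for the OS is your own call, not a recommendation.

```shell
#!/bin/sh
# wired_limit_mb (hypothetical helper): total RAM in GB minus the amount
# you want to keep for the OS in GB, converted to MB (1 GB = 1024 MB).
wired_limit_mb() {
    total_gb=$1
    reserve_gb=$2
    echo $(( (total_gb - reserve_gb) * 1024 ))
}

# Example (not executed here): keep 5 GB for the OS on a 32 GB machine.
#   sudo sysctl iogpu.wired_limit_mb="$(wired_limit_mb 32 5)"
```

For a 32 GB machine keeping 5 GB back, this prints 27648.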

giancarlostoro an hour ago | parent [-]

Oh man! I had no idea I could do this at all! What do you usually tweak it to? I feel like 8 GB is probably still a reasonable amount to give the rest of the OS.

pixelesque 11 minutes ago | parent [-]

I've got a 32 GB MBPro, and I set it to 27700, which I haven't seen a problem with so far.