madduci 7 hours ago

Like everybody's got 128 GB of RAM...

sleepyeldrazi 7 hours ago | parent | next [-]

I've been running it almost since launch on a 3090 (24 GB VRAM); you really don't need that much. Secondhand, those are really cheap, and I get 50-70 t/s (with MTP at 2) at full ctx. IQ4_NL (unsloth) on this model seems suspiciously competent, and after the (by now not so recent) updates to q4 KV on llama.cpp, I just keep going back to it after dsv4pro disappointed me for the 100th time by giving up on a task.

dofm 7 hours ago | parent | prev [-]

Doesn't need it at Q4 at least; it'll run in 64GB.

intothemild 5 hours ago | parent [-]

Q6 can run with 256k context and a Q4 KV cache on 32 GB, easy.

200k @ K: Q5_0, V: Q4_1 (which is a bit of a sweet spot)
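
For anyone wanting to sanity-check numbers like these, here is a rough sketch of the KV cache arithmetic. The bits-per-element figures follow from llama.cpp's 32-value quant blocks (for example Q5_0 is 22 bytes per block, Q4_1 is 20). The model shape in the usage lines (48 layers, 8 KV heads, head dim 128) is a made-up placeholder, not the model discussed above, and the function name `kv_cache_gib` is just for illustration. It assumes every layer keeps a full-attention KV cache, so it overestimates for hybrid or sliding-window architectures.

```python
# Approximate bits per cached element for llama.cpp cache types.
# Each quant block holds 32 values plus per-block scale (and min) overhead.
BITS_PER_ELEMENT = {
    "f16": 16.0,
    "q8_0": 8.5,   # 34 bytes / 32 values
    "q5_1": 6.0,   # 24 bytes / 32 values
    "q5_0": 5.5,   # 22 bytes / 32 values
    "q4_1": 5.0,   # 20 bytes / 32 values
    "q4_0": 4.5,   # 18 bytes / 32 values
}


def kv_cache_gib(n_ctx, n_layers, n_kv_heads, head_dim, type_k, type_v):
    """Estimate KV cache size in GiB for a plain full-attention transformer."""
    # One K vector and one V vector per token, per layer, per KV head.
    elems = n_ctx * n_layers * n_kv_heads * head_dim
    bits = elems * (BITS_PER_ELEMENT[type_k] + BITS_PER_ELEMENT[type_v])
    return bits / 8 / 2**30


if __name__ == "__main__":
    # Hypothetical model shape, chosen only to show the relative savings.
    shape = dict(n_ctx=200_000, n_layers=48, n_kv_heads=8, head_dim=128)
    for k, v in [("f16", "f16"), ("q8_0", "q8_0"), ("q5_0", "q4_1"), ("q4_0", "q4_0")]:
        print(f"K={k:5s} V={v:5s} -> {kv_cache_gib(**shape, type_k=k, type_v=v):6.2f} GiB")
```

Whatever the real model shape is, the ratio holds: K at Q5_0 with V at Q4_1 costs 10.5 bits per K/V pair versus 32 for F16, so roughly a third of the memory.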