alfiedotwtf 2 hours ago

What is everyone running DeepSeek v4 Flash with?!

It’s currently unsupported in llama.cpp, and vLLM doesn’t support GPU+CPU MoE, so unless all of you have an array of DGX Sparks in your bedroom, what’s the secret sauce?!

zozbot234 39 minutes ago

https://www.github.com/antirez/ds4 (from Antirez of Redis fame) runs a 2-bit quant on Apple Silicon hardware with 96GB or 128GB of RAM.