binyu 5 hours ago

DeepSeek V4, with its 1-million-token context window, is pretty powerful, although it's still not quite there. There's hope that Opus 4.5-level performance running locally is not that far away.

Aurornis 5 hours ago

Running DeepSeek V4 without extreme quantization locally requires a lot of hardware.

The IQ2 quants that fit into 128GB machines are very degraded.

binyu 5 hours ago

That is true. It is a 1.6T-parameter model, so it requires a great deal of memory. I also heard there's a 2-bit quantization that works well on Apple Metal.
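
For a rough sense of scale, here is a back-of-the-envelope sketch of weight memory at different bit widths. It assumes the 1.6T parameter count quoted above and a uniform bit width for every weight; the function name `weight_memory_gb` is made up for illustration. Real quantization formats mix bit widths and add per-block metadata, and the KV cache and activations need memory on top of this, so treat the numbers as an approximation for weights alone.

```python
# Rough weights-only memory estimate. Assumptions (not from any official spec):
# - parameter count is the 1.6T figure quoted in the comment above
# - every weight is stored at the same bit width, with no quantization metadata
# - KV cache, activations, and runtime overhead are ignored

PARAMS = 1_600_000_000_000  # 1.6T parameters, as quoted in the thread


def weight_memory_gb(num_params: int, bits_per_weight: int) -> float:
    """Return approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits_per_weight // 8
    return total_bytes / 1e9


for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit: ~{weight_memory_gb(PARAMS, bits):,.0f} GB")
```

Under these assumptions memory scales linearly with bit width, so halving the bits per weight halves the footprint.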

tuananh 5 hours ago

From what I've read, DeepSeek V4 is very close to Opus 4.6 in performance.

DeathArrow 2 hours ago

The full model is, not the quantized versions.

tuananh an hour ago

Yeah, that goes without saying. How could an open-weight, quantized version beat SOTA :)