zozbot234 2 hours ago
Qwen 3.6 is a toy compared to DeepSeek V4 Flash or Pro. With SSD offloading, these models can now run on Apple Silicon hardware; Flash needs as little as 32GB of RAM with a 2-bit quant, which is still quite capable. Performance is just about reasonable for interactive use, and far better than Qwen on longer contexts, thanks to DeepSeek's more efficient KV cache and attention mechanisms. Much larger improvements may be viable for unattended inference with large batches, which can reuse sparse experts across the batch and thereby mask some of the latency involved. That advantage is fairly specific to DeepSeek, again because of its efficient KV cache.
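To make the batching point concrete, here is a rough back-of-the-envelope sketch of why bigger batches amortize expert loads in a mixture-of-experts model. It assumes each token independently routes to `top_k` of `num_experts` experts uniformly at random, which real routers do not do. The figures used (64 experts, top-4) are purely hypothetical and are not DeepSeek V4's actual configuration.

```python
def expected_distinct_experts(num_experts: int, top_k: int, batch_tokens: int) -> float:
    """Expected number of distinct experts touched by one batch in one MoE layer.

    Assumption (not from the thread): every token picks top_k distinct experts
    uniformly at random, independently of the other tokens.
    """
    # Probability that a given expert is NOT selected by a single token.
    p_miss_one_token = 1.0 - top_k / num_experts
    # Probability that it is missed by every token in the batch.
    p_miss_batch = p_miss_one_token ** batch_tokens
    return num_experts * (1.0 - p_miss_batch)


def expert_loads_per_token(num_experts: int, top_k: int, batch_tokens: int) -> float:
    """Expected expert loads (e.g. reads from SSD) amortized over the batch."""
    return expected_distinct_experts(num_experts, top_k, batch_tokens) / batch_tokens


if __name__ == "__main__":
    # Hypothetical layer shape, for illustration only.
    NUM_EXPERTS, TOP_K = 64, 4
    for batch in (1, 8, 32, 128):
        loads = expert_loads_per_token(NUM_EXPERTS, TOP_K, batch)
        print(f"batch={batch:4d}  expert loads per token ~ {loads:.2f}")
```

Under these assumptions a single token pays for all `top_k` expert loads itself, while a large batch touches at most `num_experts` experts in total, so the per-token cost of pulling experts off the SSD falls as the batch grows. Real routing is skewed rather than uniform, so treat the numbers as illustrative only.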
greenavocado 2 hours ago
Qwen 3.6 27B still curb stomps DeepSeek V4 in coding.