DeepSeek-V3.2 should be better for long context because it uses (near-linear) sparse attention.
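
To make the "near linear" point concrete, here is a rough back-of-the-envelope sketch, not the model's actual implementation. It only counts how many query-key pairs get scored under causal dense attention versus a generic top-k sparse scheme where each token attends to at most `top_k` earlier tokens. The function names and the `top_k` value are made up for illustration, and the count ignores whatever work goes into selecting which tokens to keep.

```python
def dense_attention_pairs(seq_len: int) -> int:
    """Query-key pairs scored by causal dense attention.

    Token i (0-based) attends to itself and all i earlier tokens,
    so the total grows quadratically with seq_len.
    """
    return seq_len * (seq_len + 1) // 2


def sparse_attention_pairs(seq_len: int, top_k: int) -> int:
    """Query-key pairs when each token attends to at most top_k tokens.

    The first top_k tokens still see their full (short) prefix; every
    later token is capped at top_k, so growth is linear past that point.
    """
    if seq_len <= top_k:
        return dense_attention_pairs(seq_len)
    return dense_attention_pairs(top_k) + (seq_len - top_k) * top_k


if __name__ == "__main__":
    TOP_K = 2048  # hypothetical budget, chosen only for illustration
    for n in (4_096, 32_768, 131_072):
        dense = dense_attention_pairs(n)
        sparse = sparse_attention_pairs(n, TOP_K)
        print(f"{n:>7} tokens: dense/sparse pair ratio = {dense / sparse:.1f}")
```

The ratio keeps growing with context length, which is the whole argument: the dense cost scales with the square of the sequence length while the capped version scales roughly with the length itself.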