We have had the same experience (qwen3.6 rocks) while evaluating local models for our developers in the Norwegian Government: https://github.com/navikt/mlx-workspace