+1 for using the llama.cpp Vulkan releases with the Qwen models; they run much better than the ROCm releases.
I'll have to give preserve_thinking a shot.