Are you running gpt-5.5 on xhigh reasoning? Because I'm seeing a clear difference between that and gpt-5.4 on xhigh.