The 'windows' crate is really good at not being slow to compile.
It ruthlessly uses Cargo features to cut down the amount of generated code, and it sticks to fairly simple #ifdef-like metaprogramming (cfg attributes keyed on those features).
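
As a rough illustration of that #ifdef-like style, here is a minimal sketch of feature gating with Rust's cfg attribute. The feature names ("gdi", "threading") and the module below are hypothetical and are not taken from the 'windows' crate itself. In a real crate they would be declared under [features] in Cargo.toml.

```rust
// Hypothetical optional module: it is only compiled when the (made-up)
// "gdi" feature is enabled, much like code inside an #ifdef block in C.
#[cfg(feature = "gdi")]
pub mod gdi {
    pub fn describe() -> &'static str {
        "gdi bindings compiled in"
    }
}

/// Returns the names of the optional parts that were compiled in.
/// Each push is removed entirely when its feature is disabled.
pub fn enabled_modules() -> Vec<&'static str> {
    // With every feature off nothing mutates the vector, so silence the lint.
    #[allow(unused_mut)]
    let mut names = Vec::new();

    #[cfg(feature = "gdi")]
    names.push("gdi");

    #[cfg(feature = "threading")]
    names.push("threading");

    names
}
```

Built with no features enabled, the gdi module and both push statements never reach the compiler's later stages, so enabled_modules() returns an empty list. Enabling a feature (for example with cargo build --features gdi) brings only that part back in, which is the kind of trimming the comment above is describing.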