PyTorch now has native support for the Blackwell architecture:
https://pytorch.org/blog/pytorch-2-7/
It does, but the performance is pretty bad, worse than on Hopper.
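
For anyone who wants to check what "native support" means on their own install, here is a rough sketch. It assumes Blackwell parts report a compute capability major version of 10 or higher (e.g. `sm_100`, `sm_120`), and `has_blackwell_arch` is just a helper name made up for this example, not a PyTorch API. It only tells you whether the build was compiled for those targets, nothing about how fast it runs.

```python
import re


def has_blackwell_arch(arch_list):
    """Return True if any entry looks like a Blackwell-class target.

    Assumption: Blackwell-class GPUs have compute capability major >= 10,
    so entries such as 'sm_100' or 'sm_120' count, while 'sm_90' (Hopper)
    does not.
    """
    for arch in arch_list:
        # Entries look like 'sm_90', 'compute_90', 'sm_100', possibly with
        # a trailing letter. The last digit is the minor version.
        match = re.fullmatch(r"(?:sm|compute)_(\d+)(\d)[a-z]?", arch)
        if match and int(match.group(1)) >= 10:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch version:", torch.__version__)
        # get_arch_list() returns the CUDA architectures this build was
        # compiled for (an empty list if CUDA is unavailable).
        archs = torch.cuda.get_arch_list()
        print("Compiled architectures:", archs)
        print("Build targets Blackwell:", has_blackwell_arch(archs))
        if torch.cuda.is_available():
            # (major, minor) compute capability of the first visible GPU.
            print("Device capability:", torch.cuda.get_device_capability(0))
```

If the compiled architecture list stops at `sm_90`, the build predates Blackwell support, and the GPU will not be used natively by that build.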