Both Mojo and ThunderKittens/HipKittens are viable on AMD.
In some cases, Mojo runs faster than CUDA on NVIDIA hardware:
https://x.com/clattner_llvm/status/1982196673771139466?s=61