PTX assembly. DeepSeek used some of it for a small amount of work that CUDA had no APIs for.
Sadly, it's platform-specific (NVIDIA GPUs only).