FYI, you can drop down to PTX if need be:
https://github.com/Rust-GPU/Rust-CUDA/blob/aa7e61512788cc702...