| ▲ | einpoklum 3 hours ago | |
First, nice writeup; it goes into a lot of nooks and crannies. That said, much of the user-space "voodoo" goes away if you don't go through CUDA's "runtime API". If you use the driver API instead, taking your kernel source as a string and compiling it with NVIDIA's run-time compiler (NVRTC), you get better visibility into a lot (though not all) of what's going on. For the "raw" version of this, look at: https://github.com/NVIDIA/cuda-samples/tree/master/cpp/0_Int... For a much more readable, and still fully transparent, modern-C++ API version of the same, try this: https://github.com/eyalroz/cuda-api-wrappers/blob/master/exa... That is a sample program for my header-only CUDA API wrappers library. | ||
| ▲ | mschuetz 2 hours ago | parent [-] | |
I like the driver API because it lets you treat CUDA kernels like hot-reloadable shaders. It's fun to develop when you can change the code at runtime. | ||
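To make the hot-reload idea concrete, here is a minimal C++17 sketch of the host-side loop, not anyone's actual implementation. `KernelHotReloader` and its `rebuild` callback are hypothetical names introduced for illustration. The assumption is that, in a real program, the callback would compile the source string with NVRTC (`nvrtcCompileProgram`), load the resulting PTX with `cuModuleLoadData`, and swap in the new function handle; the sketch leaves that step pluggable so it stays free of GPU dependencies.

```cpp
#include <filesystem>
#include <fstream>
#include <functional>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical helper: watches one kernel source file and hands its contents
// to a rebuild callback whenever the file's modification time changes.
class KernelHotReloader {
public:
    // Assumed contract: returns true if the new source compiled and was
    // installed, false if the build failed (the old kernel then stays in use).
    using Rebuild = std::function<bool(const std::string&)>;

    KernelHotReloader(fs::path path, Rebuild rebuild)
        : path_(std::move(path)), rebuild_(std::move(rebuild)) {}

    // Call once per frame or iteration. Returns true only when a new build
    // was installed during this call.
    bool poll() {
        std::error_code ec;
        const fs::file_time_type stamp = fs::last_write_time(path_, ec);
        if (ec) return false;  // file missing or unreadable: keep the old kernel
        if (last_stamp_ && *last_stamp_ == stamp) return false;  // unchanged

        std::ifstream in(path_);
        if (!in) return false;
        std::ostringstream source;
        source << in.rdbuf();

        // Record the timestamp before rebuilding, so a source file that fails
        // to compile is not recompiled on every poll until it is saved again.
        last_stamp_ = stamp;
        if (!rebuild_(source.str())) return false;
        ++generation_;
        return true;
    }

    // Number of successful rebuilds so far.
    int generation() const { return generation_; }

private:
    fs::path path_;
    Rebuild rebuild_;
    std::optional<fs::file_time_type> last_stamp_;
    int generation_ = 0;
};
```

In use, you would construct one reloader per kernel file and call `poll()` at the top of the render or simulation loop; when it returns true, subsequent launches pick up the freshly built kernel. Polling the modification time is the simplest option; a file-watching API would avoid the per-iteration check at the cost of platform-specific code.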