▲ bee_rider 14 days ago
If I have mostly CPU code and I want to time the scenario "I have just a couple of subroutines that I'm willing to offload to the GPU," what's wrong with sprinkling my code with plain old Python timing calls? If I don't care which part of the CUDA ecosystem is taking the time (from my point of view it's a black box that does GEMMs), why not just measure "time until my normal code is running again"?
▲ nickysielicki 14 days ago | parent
If you care enough to time it, you should care enough to time it correctly. | ||||||||||||||||||||||||||||||||||||||
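The pitfall behind this reply: CUDA kernel launches are asynchronous, so wrapping the call in wall-clock timers measures only the (cheap) launch, not the actual work, unless you synchronize first. Below is a pure-Python sketch of that failure mode using a background thread as a stand-in for a GPU kernel; real PyTorch code would call `torch.cuda.synchronize()` (or use CUDA events) before reading the clock. The names and the 0.2 s "kernel" duration here are illustrative assumptions.

```python
import threading
import time

def launch_async(result):
    """Stand-in for an asynchronous CUDA kernel launch: returns
    immediately while the work continues in the background."""
    t = threading.Thread(target=lambda: (time.sleep(0.2), result.append(42)))
    t.start()
    return t

result = []
start = time.perf_counter()
handle = launch_async(result)

# Naive timing: the launch returned instantly, so this number is
# almost zero even though the "kernel" is still running.
naive = time.perf_counter() - start

# Correct timing: wait for the work to finish before reading the
# clock (the analogue of torch.cuda.synchronize()).
handle.join()
synchronized = time.perf_counter() - start

print(f"naive: {naive:.4f}s, synchronized: {synchronized:.4f}s")
```

If you only care about "time until my normal code is running again" and your CPU code genuinely blocks on the result (e.g. you immediately copy it back to host memory), the two numbers converge, since the copy forces a synchronization anyway. But timers placed around a launch that doesn't block will silently report the wrong thing.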