| ▲ | bee_rider 5 hours ago | ||||||||||||||||
I took a lot longer than I should have to finish my PhD because I wanted to beat well written/properly used vendor code. I wouldn’t recommend it, TBH. It did make my defense a lot easier because I could just point at the graphs and say “see I beat MKL, whatever I did must work.” But I did a lot of little MPI tricks and tuning, which doesn’t add much to the scientific record. It was fun though. I don’t know. Mixed feelings. To some extent I don’t really see how somebody could put all the effort into getting a PhD and not go on a little “I want to tune the heck out of these MPI routines” jaunt. | |||||||||||||||||
| ▲ | shihab 4 hours ago | parent [-] | ||||||||||||||||
To be practically useful, we don't need to beat vendors, just getting close would be enough, by the virtue of being open-source (and often portable). But I found, as an example, PETSc to be ~10x slower than MKL on CPU and CUDA on GPU; It still doesn't have native shared memory parallelism support on CPU etc. | |||||||||||||||||
| |||||||||||||||||