| ▲ | pjmlp 3 hours ago | |
Yet there are gains from doing e.g. string searches with SIMD, which you naturally aren't going to do in CUDA. | ||
| ▲ | fooker 3 hours ago | parent [-] | |
For sure, it makes sense for nice, well-defined problems that execute in isolation. But think of the situation where the string search is running on a system that has hyper-threading, a bunch of cores, and a normal amount of memory bandwidth. The search itself will be faster, but if you overuse vector instructions you make everything else running on that system worse. (Also, cherry on top: some modern CPUs automagically lower the clock when they encounter vector instructions!!!) | ||