Remix.run Logo
m-schuetz 4 days ago

That and https://github.com/b0nes164/GPUSorting have been a tremendous help for me, since CUB does not nicely work with the Cuda Driver Api. The author is doing amazing work.

mfabbri77 4 days ago | parent [-]

At what order of magnitude in the number of elements to be sorted (I'm thinking to the overhead of the GPU setup cost) is the break-even point reached, compared to a pure CPU sort?

m-schuetz 4 days ago | parent [-]

No idea unfortunately. For me it's mandatory to sort on the GPU because the data already resides on GPU, and copying it to CPU (and the results back to GPU) would be too costly.