Remix.run Logo
dooglius 6 days ago

is_heap doesn't seem like a particularly useful operation though, generally a heap is intentionally constructed as such via push_heap/pop_heap

mananaysiempre 6 days ago | parent | next [-]

Indeed that part seems more like an artist’s study than an attempt at actual usefulness, but that’s okay when figuring out if anything at all in the neighbourhood of what you’re trying to do with SIMD is even possible. As far as constructing heaps, don’t forget about (linear-time) heapify, which can be significantly faster if you have a bunch of elements and want to construct a heap with all of them in it. (This doesn’t get you a linear-time heap sort because you’ll still pay the full linearithmic price for the subsequent pop_heaps.)

simonask 6 days ago | parent | prev | next [-]

I think it's fairly useful. It means that you can convert a contiguous array to a heap with a fast O(n) read-only check instead of O(n log n) writes, so if you know that's the common case, you can detect it up front and only revert to normal binary heap insertion if it returns false.

madars 6 days ago | parent | next [-]

By the way, you can also construct heap in linear time - instead of doing n consecutive insertions, at least half of which require log n work, you can apply sift-down operations from bottom up, beginning with the last non-leaf node and working backwards to the root.

That way, roughly half the nodes are leaves (requiring no work), a quarter are at the second-to-last level (requiring at most 1 comparison/swap), an eighth at the third-to-last level (requiring at most 2 comparisons/swaps), and so on. Summing up 1 n/4 + 2 n/8 + ... gets you O(n) total complexity.

See https://en.wikipedia.org/wiki/Heapsort?useskin=monobook#Vari...

gotoeleven 6 days ago | parent | prev [-]

I think the parent comment is asking what process or algorithm is there that would result in an array that was sometimes but not always a heap, and you'd want to do something based on whether the array was in fact a heap or not? Like in your example, what process do you have in mind that might result in a heap and might not?

simonask 3 days ago | parent [-]

Any time you deal with semi-trusted input, like an internal protocol or deserialization. :-)

camel-cdr 6 days ago | parent | prev [-]

make_heap should be vectorizable, that would be more useful. I can also see a path to vectorize bulk insert, but that seems harder.

dragontamer 6 days ago | parent [-]

I don't believe it's possible to vectorize the classic heap.

I've seen vectorized and SIMD heap implementations. They are a different data structure entirely. You basically work with 16-sorted items and then sort your working set (16-items) with any node, allowing you to generalize the push down or push up operations of a heap. (Sort both lists. Top16 make a node and stay here. Bottom16 push down and recurse).

This is very very similar to a classic heap but you need a lot of operations so that you have enough work to SIMD.

Sorting is after all, a highly parallelized operation (Bionic Sort, MergePath, etc) and is a good basis for generalizing single threaded data structures into a multi thread or SIMD version.