orlp 12 hours ago
AArch64 does indeed have a proper atomic max, but even on x86-64 you can get a wait-free atomic max as long as you only need to support a small range of integers, e.g. values 0 through 63 with a single 64-bit word. In that case you simply do a `lock or` with `1 << i` to record the value `i`; the current maximum is then the index of the highest set bit. You can support larger ranges by using multiple words, e.g. four 64-bit words for a u8 maximum. In most cases it's even better to store a maximum per thread separately and loop over all threads to compute the current maximum only when you really need it.
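A minimal Rust sketch of the single-word version, for values 0 through 63. The type name `BitsetMax` and its methods are made up for illustration. `fetch_or` with an unused result is typically lowered to a `lock or` on x86-64, but that is up to the compiler, and the `Relaxed` ordering here is an assumption that the maximum is not used to synchronize other data.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical helper: a wait-free running maximum over the values 0..=63.
/// Bit `i` is set once the value `i` has been recorded.
pub struct BitsetMax {
    bits: AtomicU64,
}

impl BitsetMax {
    pub const fn new() -> Self {
        Self { bits: AtomicU64::new(0) }
    }

    /// Record `value`. A single atomic OR, so there is no retry loop.
    /// Panics if `value` does not fit in the 64-bit word.
    pub fn record(&self, value: u32) {
        assert!(value < 64, "value out of range for a single 64-bit word");
        self.bits.fetch_or(1u64 << value, Ordering::Relaxed);
    }

    /// The largest value recorded so far, or `None` if nothing was recorded.
    /// This is the index of the highest set bit.
    pub fn max(&self) -> Option<u32> {
        let bits = self.bits.load(Ordering::Relaxed);
        if bits == 0 {
            None
        } else {
            Some(63 - bits.leading_zeros())
        }
    }
}
```

Recording never loops or fails, unlike a compare-and-swap based max, at the cost of the narrow value range.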
jerrinot 12 hours ago
That’s a neat trick, albeit with limited applicability given the very narrow range. Thanks for sharing!