Bit manipulation instructions are great for high-performance code, too, because they let you compute conditional results without branching, which avoids branch misprediction penalties on unpredictable data.
Some real-world examples in simdjson: https://arxiv.org/pdf/1902.08318
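
To make the idea concrete, here is a minimal sketch in C of two common branchless patterns: selecting between two values with a mask, and walking the set bits of a 64-bit word using count-trailing-zeros plus clear-lowest-set-bit. It is not code from simdjson; the function names `select_u64` and `set_bit_positions` are made up for illustration, and `__builtin_ctzll` assumes GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns a if cond is nonzero, otherwise b,
 * using a mask instead of an if/else. */
static inline uint64_t select_u64(int cond, uint64_t a, uint64_t b) {
    /* mask is all ones when cond is nonzero, all zeros otherwise */
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}

/* Hypothetical helper: writes the index of every set bit in `bits`
 * to `out` (lowest index first) and returns how many were written.
 * `out` must have room for up to 64 entries.
 * Assumes GCC/Clang for __builtin_ctzll. */
static inline size_t set_bit_positions(uint64_t bits, uint32_t *out) {
    size_t n = 0;
    while (bits != 0) {
        out[n++] = (uint32_t)__builtin_ctzll(bits); /* index of lowest set bit */
        bits &= bits - 1;                           /* clear lowest set bit */
    }
    return n;
}
```

The second function still has a loop condition, but it never tests individual bit positions: the work is proportional to the number of set bits, not to the width of the word. On x86 with BMI1 enabled, compilers typically map the two loop-body operations to `tzcnt` and `blsr`, though that depends on the compiler and flags.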