anonymoushn a day ago
you can have some vector registers n_acc, ns, idx_acc, idxs, then on each iteration you can do a min, eq, blend over them.
Edit: I wrote this with min, eq, blend, but you can actually use cmpgt, min, blend to avoid having a dependency chain through all three instructions. I am just used to using min, eq, blend because of working on unsigned values that don't have cmpgt.

you can consult the list of toys here: https://www.intel.com/content/www/us/en/docs/intrinsics-guid...
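For anyone who wants to see the cmpgt, min, blend version spelled out, here is a rough sketch of how it could look as an argmin over signed 32-bit values with SSE4.1. This is my guess at the shape, not the original snippet: the function name argmin_i32, the 4-lane width, the first-minimum tie-breaking, and the scalar tail are all assumptions. I'm reading n_acc as the running per-lane minimum, ns as the freshly loaded values, idx_acc as the index of each lane's minimum, and idxs as the indices of the freshly loaded values.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: index of the first minimum of a[0..n). Assumes n > 0 and that
 * n fits in a signed 32-bit lane (indices are tracked as int32). */
__attribute__((target("sse4.1")))
size_t argmin_i32(const int32_t *a, size_t n)
{
    assert(n > 0);
    int32_t best = a[0];
    size_t best_idx = 0;
    size_t i = 0;

    if (n >= 4) {
        __m128i n_acc = _mm_loadu_si128((const __m128i *)a);
        __m128i idx_acc = _mm_setr_epi32(0, 1, 2, 3);
        __m128i idxs = idx_acc;
        const __m128i step = _mm_set1_epi32(4);

        for (i = 4; i + 4 <= n; i += 4) {
            __m128i ns = _mm_loadu_si128((const __m128i *)(a + i));
            idxs = _mm_add_epi32(idxs, step);
            /* cmpgt and min both read the old n_acc, so neither waits
             * on the other; only the blend waits on the cmpgt mask. */
            __m128i gt = _mm_cmpgt_epi32(n_acc, ns);
            n_acc = _mm_min_epi32(n_acc, ns);
            idx_acc = _mm_blendv_epi8(idx_acc, idxs, gt);
        }

        /* Horizontal reduction across the 4 lanes, done in scalar code.
         * On equal values the smaller index wins. */
        int32_t vals[4], idx[4];
        _mm_storeu_si128((__m128i *)vals, n_acc);
        _mm_storeu_si128((__m128i *)idx, idx_acc);
        best = vals[0];
        best_idx = (size_t)idx[0];
        for (int k = 1; k < 4; k++) {
            if (vals[k] < best ||
                (vals[k] == best && (size_t)idx[k] < best_idx)) {
                best = vals[k];
                best_idx = (size_t)idx[k];
            }
        }
    }

    /* Scalar tail for the last n % 4 elements (or everything if n < 4). */
    for (; i < n; i++) {
        if (a[i] < best) {
            best = a[i];
            best_idx = i;
        }
    }
    return best_idx;
}
```

The target attribute is a GCC/Clang extension that lets this compile without passing -msse4.1 for the whole file; with another compiler you would drop it and enable SSE4.1 some other way. Because the compare is a strict greater-than, each lane keeps its earliest index on ties, which is what lets the final reduction return the first minimum.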