menaerus 3 days ago
The ClickBench dataset is ~70 GB IIRC, so I find it interesting that they measured such a substantial speedup while only using SSE4.1 (128-bit), so not even AVX2, much less AVX-512. I wonder what the results would have been with either of the latter. I also wonder if this is (partly) an artifact of more laser-focused utilization of a CPU core's ALUs and memory subsystem. E.g. packing more work into a single instruction, or a pair of instructions, now leaves more room for other, unrelated instructions to be retired.