| ▲ | sidewndr46 5 hours ago |
| Not to suggest you weren't competent, but did you consider, and try to control for, the possibility that your measurement itself was the problem? |
|
| ▲ | magicalhippo 5 hours ago | parent [-] |
| Not going to dismiss it, but I did try to not do stupid stuff. I used QueryPerformanceCounter outside the loop, pinned the benchmark thread to a single core, and the array of elements it processed was fairly large. So I don't think overhead and throttling were an issue. The measurements were very consistent and repeatable. |
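
For readers who want to try the same setup, here is a rough sketch in Python rather than the native code the comment describes. It makes several assumptions: `time.perf_counter_ns` stands in for QueryPerformanceCounter (a Windows API), the pinning relies on `os.sched_setaffinity`, which is only available on Linux, and `pin_to_one_core`, `process` and `time_whole_run` are hypothetical names, not anything from the original benchmark.

```python
import os
import time


def pin_to_one_core():
    """Restrict the caller to a single CPU where the OS supports it."""
    if hasattr(os, "sched_setaffinity"):  # assumption: Linux only, skipped elsewhere
        cpu = min(os.sched_getaffinity(0))
        os.sched_setaffinity(0, {cpu})
        return cpu
    return None


def process(values):
    """Hypothetical workload standing in for the routine being measured."""
    total = 0
    for v in values:
        total += v * v
    return total


def time_whole_run(func, data, repeats=5):
    """Time full passes over `data`; the clock is read only outside the inner loop."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        func(data)
        timings.append(time.perf_counter_ns() - start)
    return timings


if __name__ == "__main__":
    pin_to_one_core()
    data = list(range(1_000_000))  # large input so timer overhead is negligible
    runs = time_whole_run(process, data)
    print("min ns:", min(runs), "spread ns:", max(runs) - min(runs))
```

A small spread between the fastest and slowest run is one sign that the numbers are repeatable; a large one suggests noise from scheduling or throttling.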
| |
| ▲ | sidewndr46 4 hours ago | parent [-] | | Fair enough, I've only really ever found assembly-level optimization on embedded microcontrollers to make any degree of sense. Performance optimization usually means something along the lines of "convince co-workers not to implement their own bubble sort" in my line of work. | | |
| ▲ | magicalhippo 2 hours ago | parent [-] | | Yeah, I've also come across a lot of assembly code which was faster 10 years ago, but which the compiler now beats. So for a while now my take has been to mostly avoid asm, but if it's needed, always have a compiled version as well, and always do runtime performance detection to select the optimal version. |
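
The "runtime performance detection" idea can be sketched roughly as follows. This is only an illustration in Python, not the commenter's code: `sum_squares_loop` plays the role of the portable compiled version, `sum_squares_builtin` plays the role of the hand-tuned variant, and `pick_fastest` is a hypothetical helper that times each candidate once at startup and keeps the winner.

```python
import time


def sum_squares_loop(values):
    """Reference implementation (stands in for the plain compiled version)."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_builtin(values):
    """Alternative implementation (stands in for the hand-optimized version)."""
    return sum(v * v for v in values)


def _time_once(func, sample):
    """Return the wall-clock nanoseconds for one call of func(sample)."""
    start = time.perf_counter_ns()
    func(sample)
    return time.perf_counter_ns() - start


def pick_fastest(candidates, sample, repeats=5):
    """Return the fastest candidate that agrees with the first (reference) one."""
    reference = candidates[0]
    expected = reference(sample)
    best, best_time = reference, None
    for func in candidates:
        if func(sample) != expected:
            continue  # never select a variant that disagrees with the reference
        elapsed = min(_time_once(func, sample) for _ in range(repeats))
        if best_time is None or elapsed < best_time:
            best, best_time = func, elapsed
    return best


if __name__ == "__main__":
    sample = list(range(100_000))
    sum_squares = pick_fastest([sum_squares_loop, sum_squares_builtin], sample)
    print("selected:", sum_squares.__name__)
```

Keeping the reference version first means there is always a working fallback, and a tuned variant that has become slower on newer hardware simply stops being selected.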
|
|