janwas 2 days ago

CPU-time would over-emphasize regions where many threads are running, right? I find wall-time useful for finding serial regions that aren't yet parallelized.
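To put rough numbers on that, here is a toy sketch (not output from any real profiler). The phase names, durations, and thread counts are made up, and it assumes every thread in a phase is busy for the whole phase:

```python
# Hypothetical workload: (phase name, wall-clock seconds, busy threads).
# All values are invented for illustration.
phases = [
    ("serial_setup", 4.0, 1),
    ("parallel_compute", 1.0, 16),
]


def shares(phases):
    """Return {name: (wall_share, cpu_share)} for each phase.

    Wall time counts each phase once; CPU time multiplies it by the
    number of threads assumed busy during that phase.
    """
    wall_total = sum(wall for _, wall, _ in phases)
    cpu_total = sum(wall * threads for _, wall, threads in phases)
    return {
        name: (wall / wall_total, (wall * threads) / cpu_total)
        for name, wall, threads in phases
    }


result = shares(phases)
for name, (wall_share, cpu_share) in result.items():
    print(f"{name}: wall {wall_share:.0%}, cpu {cpu_share:.0%}")
```

With these made-up numbers the serial phase is 80% of wall time but only 20% of CPU time, so a CPU-time profile would point at the already-parallel phase while the serial one dominates what the user actually waits for.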

More detail here: https://github.com/dvyukov/perf-load. We recently implemented the same idea without requiring context-switch events: https://github.com/google/highway/blob/master/hwy/profiler.h...