CCs 6 days ago

This uses stress-ng for benchmarking, even though the stress-ng documentation says it is not suitable for benchmarking. It was written to max out one component until it burns. Using a real app like Memcached or Postgres would show more realistic numbers, closer to what people see in production. The difference is not major (50% utilization is still closer to 80% under real load), but a real app breaks down faster. stress-ng is nicely linear until 100%, while Memcached will have a hockey-stick curve at the end.

BrendanLong 6 days ago | parent [-]

The advantage of stress-ng is that it's easy to make it run with specific CPU utilization numbers. The tests where I run some number of workers at 100% utilization are interesting since they give such perfect graphs, but I think the version where I have 24 workers and increase their utilization slowly is more realistic for showing how production CPU utilization changes.
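For anyone who wants to try a similar sweep, here is a minimal sketch that only builds the stress-ng command lines for 24 workers at increasing target load. The 10% step size and the 60-second duration are assumptions for illustration, not the settings used in the post, and `stress_ng_commands` is a hypothetical helper name.

```python
import shlex


def stress_ng_commands(workers=24, loads=range(10, 101, 10), seconds=60):
    """Build one stress-ng invocation per target CPU load percentage.

    The step size and duration are illustrative assumptions; the flags
    used are --cpu (worker count), --cpu-load (target percent per
    worker), and --timeout (run duration).
    """
    return [
        [
            "stress-ng",
            "--cpu", str(workers),
            "--cpu-load", str(load),
            "--timeout", f"{seconds}s",
        ]
        for load in loads
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so this is safe to try.
    for cmd in stress_ng_commands():
        print(shlex.join(cmd))
```

Each printed line can be run by hand or passed to `subprocess.run` while sampling the reported CPU utilization separately.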

BrendanLong 6 days ago | parent [-]

Fun data point though, I just ran three data points of the Phoronix nginx benchmark and got these results:

- Pinned to 6 cores: 28k QPS

- Pinned to 12 cores: 56k QPS

- All 24 cores: 62k QPS

I'm not sure how this applies to realistic workloads where you're using all of the cores but not maxing them out, but it looks like hyperthreading only adds ~10% performance in this case.
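The arithmetic behind that estimate, as a quick sketch. It assumes the machine has 12 physical cores with two hardware threads each, which is implied by the hyperthreading remark but not stated outright; the QPS figures are the three quoted above.

```python
# QPS measured at each pinning level (values from the comment above).
qps_by_cores = {6: 28_000, 12: 56_000, 24: 62_000}

# Throughput per logical core at each pinning level.
per_core = {cores: qps / cores for cores, qps in qps_by_cores.items()}

# Assumption: 12 physical cores, so going from 12 to 24 logical cores
# only adds the sibling hyperthreads.
ht_gain = qps_by_cores[24] / qps_by_cores[12] - 1

# Scaling from 6 to 12 cores, for comparison (2.0 means perfectly linear).
linear_scaling = qps_by_cores[12] / qps_by_cores[6]

if __name__ == "__main__":
    for cores, value in per_core.items():
        print(f"{cores:>2} cores: {value:,.0f} QPS per core")
    print(f"6 -> 12 cores scaling factor: {linear_scaling:.2f}")
    print(f"Gain from hyperthreads: {ht_gain:.1%}")
```

Going from 6 to 12 cores doubles throughput exactly, while the second 12 logical cores add only about 11%.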

BrendanLong 4 days ago | parent | next [-]

Here are the results of the Nginx benchmark pinned to 1-24 cores: https://docs.google.com/spreadsheets/d/1d_OK_ckLT1zTA_fG4vkq...

At 51% reported CPU utilization, it's doing about 80% of the maximum requests per second, and it can't get above 80% utilization.
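To make the gap concrete, here is a small sketch comparing the headroom the utilization number suggests with the headroom the throughput number suggests. It uses only the two percentages quoted above and treats "percent of maximum requests per second" as the real measure of capacity used, which is an assumption.

```python
# Figures quoted above, as whole percentages.
reported_utilization_pct = 51
fraction_of_max_qps_pct = 80

# Headroom implied by the CPU utilization metric.
apparent_headroom_pct = 100 - reported_utilization_pct

# Headroom implied by throughput (assumed to be the real capacity measure).
actual_headroom_pct = 100 - fraction_of_max_qps_pct

if __name__ == "__main__":
    print(f"Apparent headroom: {apparent_headroom_pct}%")
    print(f"Actual headroom:   {actual_headroom_pct}%")
```

The utilization metric suggests about half the machine is free, while throughput suggests only about a fifth is.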

I also added a section: https://www.brendanlong.com/cpu-utilization-is-a-lie.html#bo...

PunchyHamster 6 days ago | parent | prev | next [-]

I'd imagine in this case it's just uncounted usage from the OS networking stack.

justsomehnguy 6 days ago | parent | prev [-]

You forgot to try 18 cores.