Worth noting that the c8gd's local NVMe is ephemeral, so you'd need to pre-stage the data before each run. For a benchmark like this that's actually ideal, though, since you avoid EBS cold-read artifacts entirely.
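For what it's worth, the pre-staging step could be sketched roughly like this. It's a minimal sketch, not a tested recipe: `stage_dataset`, `sha256_of`, and the paths in the usage comment are all hypothetical names, and it assumes the NVMe volume is already formatted and mounted.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_dataset(src: Path, dst: Path) -> int:
    """Copy src to dst from scratch, verify every file, return the file count.

    Hypothetical helper: dst is assumed to live on the local NVMe mount.
    """
    # Start from a clean directory so leftovers from an earlier run
    # cannot leak into this one.
    if dst.exists():
        shutil.rmtree(dst)
    shutil.copytree(src, dst)

    staged = 0
    for src_file in src.rglob("*"):
        if not src_file.is_file():
            continue
        dst_file = dst / src_file.relative_to(src)
        if sha256_of(src_file) != sha256_of(dst_file):
            raise RuntimeError(f"checksum mismatch after staging: {dst_file}")
        staged += 1
    return staged


# Example call (both paths are assumptions, adjust to your setup):
# stage_dataset(Path("/data/benchmark-input"), Path("/mnt/nvme/benchmark-input"))
```

One caveat: the verification pass re-reads the staged files, so they may be sitting in the page cache afterwards. If the benchmark is meant to measure the NVMe itself, you'd want to account for that before timing anything.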