dwedge 4 hours ago

Is anyone able to explain why it's so much slower when solid state doesn't really care about data location? Is this simply a quirk of Postgres, where an index scan requires two reads (unless I'm mistaken), while in MySQL the primary key index is the data? I'd be curious to see comparisons with MySQL, and also sequential vs. random reads straight from disk.

pgaddict 3 hours ago | parent | next [-]

There probably is some additional inefficiency when reading pages randomly (compared to sequential reads), but most of the difference is at the storage level. That is, SSDs can handle a lot of random I/O, but it's nowhere close to sequential reads.

For example, I have a RAID0 with 4 SSDs (Samsung 990 PRO, so consumer, but quite good for reads). And this is what fio says:

# random reads, 8K, direct IO, depth=1

fio --filename=<device name> --direct=1 --rw=randread --bs=8k --ioengine=io_uring --iodepth=1 --runtime=30 --numjobs=1 --time_based --group_reporting --name=iops-test-job --eta-newline=1 --readonly

-> read: IOPS=19.1k, BW=149MiB/s (156MB/s)(4473MiB/30001msec)

# sequential reads, 8K, direct IO, depth=1

fio --filename=/dev/md127 --direct=1 --rw=read --bs=8k --ioengine=io_uring --iodepth=1 --runtime=30 --numjobs=1 --time_based --group_reporting --name=random-1 --eta-newline=1 --readonly

-> read: IOPS=85.5k, BW=668MiB/s (700MB/s)(19.6GiB/30001msec)

With buffered I/O, random reads stay at ~19k IOPS, while sequential reads get to ~1M IOPS (thanks to read-ahead, either at the OS level or in the SSD).

So part of this is sequential reads benefiting from implicit "prefetching", which reduces the observed cost of a page. But for random I/O there's no such thing, and so it seems more expensive.

It's more complex than that, of course (e.g. sequential access also allows issuing larger reads).
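The gap is easy to reproduce without fio or a raw device. Here's a minimal Python sketch (my own illustration, not from the thread): it reads the same 8K blocks from a scratch file in sequential and in shuffled order, using buffered I/O so OS read-ahead helps the sequential pass. Absolute numbers will vary a lot by machine and cache state.

```python
import os
import random
import tempfile
import time

BLOCK = 8192      # 8K reads, matching the fio runs above
NBLOCKS = 2048    # 16 MiB scratch file


def make_file():
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(BLOCK * NBLOCKS))
    return path


def read_blocks(path, offsets):
    # Read one BLOCK at each offset; return (bytes read, elapsed seconds).
    total = 0
    with open(path, "rb") as f:
        start = time.perf_counter()
        for off in offsets:
            f.seek(off)
            total += len(f.read(BLOCK))
        elapsed = time.perf_counter() - start
    return total, elapsed


path = make_file()
seq_offsets = [i * BLOCK for i in range(NBLOCKS)]
rnd_offsets = seq_offsets[:]
random.shuffle(rnd_offsets)

seq_bytes, seq_t = read_blocks(path, seq_offsets)
rnd_bytes, rnd_t = read_blocks(path, rnd_offsets)
print(f"sequential: {seq_bytes / seq_t / 1e6:.0f} MB/s")
print(f"random:     {rnd_bytes / rnd_t / 1e6:.0f} MB/s")
os.remove(path)
```

Note this goes through the page cache (no O_DIRECT), so on a small file the second pass may be served mostly from RAM; it still illustrates the read-ahead effect on a cold cache or a larger-than-RAM file.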

ozgrakkurt 3 hours ago | parent | prev | next [-]

NVMe drives really do care about location once you hit some concurrency/size limit.

Manufacturers use many tricks, like caching writes on the drive itself. In my experience, it is rare to find an SSD that actually behaves the way you'd expect.

A solid way of measuring this is using fio with different configurations.
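For instance, a fio job file along these lines (filename and size here are placeholders I picked, not from the thread) runs both access patterns back to back against the same scratch file, so the only variable is the pattern:

```ini
[global]
filename=/tmp/fio-scratch
size=1G
bs=8k
direct=1
ioengine=libaio
iodepth=1
runtime=30
time_based=1

[seq-read]
rw=read

[rand-read]
stonewall
rw=randread
```

Varying iodepth and bs between runs shows where a given drive stops caring about location.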

re-thc 3 hours ago | parent | prev | next [-]

> Is anyone able to explain why it's so much slower when solid state doesn't really care about the data location?

It does. Just differently.

E.g. a lot of SSDs nowadays cheap out by pairing slower, lower-quality NAND with a faster, higher-quality NAND cache. Random reads tend to miss that cache a lot more often.

convolvatron 4 hours ago | parent | prev [-]

you're right. there are a couple of explanations that might have some merit looking at it from the device perspective. one is that the underlying block size is really large, so it acts like a very large cache line that a sequential scan will always hit. it's also very likely that there are prefetchers running to try and hide the latency.