| ▲ | MontyCarloHall 11 hours ago |
| The threshold at which the cache gets used is configurable, with 128 kB as the default. The assumption is that any read larger than the threshold will be a long, sustained read, for which latency matters less. My question is: do sub-threshold reads (<128 kB, or whatever the threshold is) from files larger than the threshold get saved to the cache, or is the cache only used for files whose total size is under the threshold? Frequent random access to large files is a textbook use case for a caching layer like this, but it would also be a very expensive access pattern in this system. |
|
| ▲ | the8472 10 hours ago | parent [-] |
| NVMe read latency is in the 10-100 µs range for 128 kB blocks. S3 is about 100 ms. That's 3-4 OOMs. The threshold where total read duration starts to dominate over latency would be somewhere in the dozens to hundreds of megabytes, not kilobytes. |
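A back-of-envelope sketch of that crossover point. The latency figures come from the comment above; the bandwidth figures are my own assumptions, not benchmarks:

```python
# Sketch: at what read size does streaming the payload take as long
# as the first-byte latency? Below that size, latency dominates the
# total read time; above it, throughput dominates.

def crossover_bytes(latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Size at which transfer time equals the initial round-trip latency."""
    return latency_s * bandwidth_bytes_per_s

MB = 1024 * 1024

# Assumed S3 figures: ~100 ms first-byte latency, ~100 MB/s for a
# single connection; parallel range requests can push aggregate
# throughput toward ~1 GB/s.
s3_single = crossover_bytes(0.100, 100 * MB)
s3_parallel = crossover_bytes(0.100, 1000 * MB)

print(f"S3, single stream:  {s3_single / MB:.0f} MB")   # → 10 MB
print(f"S3, parallel reads: {s3_parallel / MB:.0f} MB") # → 100 MB
```

With these assumed numbers the crossover lands in the tens-to-hundreds-of-MB range, which is consistent with the "dozens to hundreds of megabytes" estimate above.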
| ▲ | MontyCarloHall 10 hours ago | parent | prev | next [-] | | I agree, it's an oddly low threshold. The latency differential of NFS vs. S3 is a couple of OOMs, so a threshold of ~10 MB seems more appropriate to me. Perhaps it's set intentionally low to avoid racking up immense EFS bills? Setting it higher would effectively mean getting billed $0.03/GB for a huge fraction of reads, which is untenable for most applications. | |
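To put a rough number on "immense": a cost sketch taking the $0.03/GB figure from the comment at face value (check current AWS pricing before relying on it), with a hypothetical workload of my own choosing:

```python
# Rough monthly cost if sub-threshold reads are billed per GB.
# The $0.03/GB rate and the workload (100 reads/s averaging 10 MB)
# are assumptions for illustration only.

def monthly_read_cost(reads_per_s: float, avg_read_mb: float,
                      price_per_gb: float = 0.03) -> float:
    """Cost of a month (30 days) of reads at the given rate and size."""
    gb_per_month = reads_per_s * avg_read_mb / 1024 * 86400 * 30
    return gb_per_month * price_per_gb

print(f"${monthly_read_cost(100, 10):,.2f}/month")
```

At that assumed rate the bill comes to roughly $76k/month, which makes a deliberately low caching threshold look much less odd.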
| ▲ | antonvs 10 hours ago | parent | prev [-] | | > NVMe read latency is in the 10-100 µs range for 128 kB blocks. S3 is about 100 ms. That's 3-4 OOMs. Aren't you comparing local in-process latency to network latency? That's multiple OOMs right there. | | |
| ▲ | the8472 10 hours ago | parent [-] | | No, within the same DC, network latency does not add that much. After all, EFS also manages 600 µs average latency. It's really just S3 that's slow. I assume some large fraction of S3 is spread over HDDs, not SSDs. |