cldcntrl 5 days ago
> You don’t have to randomize the first part of your object keys to ensure they get spread around and avoid hotspots.

Not strictly true.
QuinnyPig 5 days ago
I should have been clearer. You still need to partition, but randomizing the prefixes hasn't been needed since 2018: https://web.archive.org/web/20240227073321/https://aws.amazo...
ed_g 5 days ago
Generally speaking, this isn't something Amazon S3 customers need to worry about. As others have said, S3 automatically scales index performance over time based on load. The challenge primarily comes when customers need large bursts of requests within a namespace that hasn't had a chance to scale; that's when balancing your workload over randomized prefixes is helpful.

Please see the documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...

The 2024 re:Invent session "Optimizing storage performance with Amazon S3 (STG328)" goes very deep on the subject: https://www.youtube.com/watch?v=2DSVjJTRsz8

And this blog post discusses Iceberg's new base-2 hash file layout, which helps optimize request scaling performance of large-scale Iceberg workloads running on S3: https://aws.amazon.com/blogs/storage/how-amazon-ads-uses-ice...
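
As a rough illustration of balancing a burst over several prefixes, the sketch below derives a short prefix from a hash of the key, so the same object always maps to the same prefix and can be located again without a lookup table. The name `spread_key`, the use of SHA-256, and the default of 16 prefixes are assumptions made for the example, not anything S3 requires.

```python
import hashlib
from collections import Counter


def spread_key(key: str, num_prefixes: int = 16) -> str:
    """Prepend a hash-derived prefix so keys spread over num_prefixes prefixes.

    Hypothetical helper: the hash function and prefix count are illustrative.
    """
    if not 1 <= num_prefixes <= 256:
        raise ValueError("num_prefixes must be between 1 and 256")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Take the first 4 bytes of the digest as an integer, then pick a bucket.
    bucket = int.from_bytes(digest[:4], "big") % num_prefixes
    return f"{bucket:02x}/{key}"


if __name__ == "__main__":
    keys = [f"images/photo-{i}.jpg" for i in range(10_000)]
    # Count how many keys land under each prefix to eyeball the balance.
    counts = Counter(spread_key(k).split("/", 1)[0] for k in keys)
    for prefix, count in sorted(counts.items()):
        print(prefix, count)
```

Because the prefix is computed from the key rather than drawn at random, a reader only needs the original key to rebuild the full object name.
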
vvoyer 4 days ago
This 2024 re:Invent session says exactly the opposite: "If you want to partition your data even better, you can introduce some randomness in your key names": https://youtu.be/2DSVjJTRsz8?t=2206

FWIW, the optimal way to partition our data, we were told, is this: `010111/some/file.jpg`, where `010111/` is a random binary string that will please both the automatic partitioning (503s => partition) and the manual partitioning you can request from AWS. "Please" in the sense that the number of possible partitions grows more slowly with each character than it does with prefixes like `az9trm/`. We were told that the latter makes manual partitioning a challenge, because as soon as you reach two characters you've already created 36x36 partitions (1,296).

The issue with that: your keys are no longer meaningful if you're relying on S3 to have "folders" by tenant, for example (customer1/..).
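
To make the cardinality argument concrete, here is a small sketch. It builds a binary-string prefix like `010111/` from a hash of the key (the comment describes a random string; hashing is swapped in here only so the prefix is reproducible), and it counts how many distinct prefixes exist at each length for a 2-symbol versus a 36-symbol alphabet. `binary_prefix` and `prefix_count` are hypothetical names.

```python
import hashlib


def binary_prefix(key: str, bits: int = 6) -> str:
    """Return the first `bits` bits of SHA-256(key) as a string of 0s and 1s."""
    if not 1 <= bits <= 256:
        raise ValueError("bits must be between 1 and 256")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Render every digest byte as 8 binary digits, then keep the leading bits.
    as_bits = "".join(f"{byte:08b}" for byte in digest)
    return as_bits[:bits]


def prefix_count(alphabet_size: int, length: int) -> int:
    """Number of distinct prefixes of the given length over an alphabet."""
    return alphabet_size ** length


if __name__ == "__main__":
    key = "some/file.jpg"
    print(f"{binary_prefix(key)}/{key}")
    # Compare growth per added character: binary vs. [a-z0-9].
    for length in range(1, 7):
        print(length, prefix_count(2, length), prefix_count(36, length))
```

At two characters the binary scheme allows 4 prefixes while the base-36 scheme already allows 1,296, which is why each split stays small with binary strings.
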
rthnbgrredf 5 days ago
Elaborate.