cldcntrl 5 days ago

> You don’t have to randomize the first part of your object keys to ensure they get spread around and avoid hotspots.

Not strictly true.

QuinnyPig 5 days ago | parent | next [-]

I should have been more clear. You still need to partition, but randomizing the prefixes hasn't been needed since 2018: https://web.archive.org/web/20240227073321/https://aws.amazo...

ed_g 5 days ago | parent | prev | next [-]

Generally speaking this isn't something Amazon S3 customers need to worry about - as others have said, S3 will automatically scale index performance over time based on load. The challenge primarily comes when customers need large bursts of requests within a namespace that hasn't had a chance to scale - that's when balancing your workload over randomized prefixes is helpful.

Please see the documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...

This 2024 re:Invent session, "Optimizing storage performance with Amazon S3 (STG328)", goes very deep on the subject: https://www.youtube.com/watch?v=2DSVjJTRsz8

And this blog that discusses Iceberg's new base-2 hash file layout which helps optimize request scaling performance of large-scale Iceberg workloads running on S3: https://aws.amazon.com/blogs/storage/how-amazon-ads-uses-ice...

vvoyer 4 days ago | parent | prev | next [-]

This 2024 re:Invent session says exactly the opposite:

"If you want to partition your data even better, you can introduce some randomness in your key names": https://youtu.be/2DSVjJTRsz8?t=2206

FWIW, the optimal way we were told to partition our data was something like this: 010111/some/file.jpg.

Where `010111/` is a random binary string, which suits both the automatic partitioning (503s => partition) and any manual partitioning you could ask AWS for. "Suits" in the sense that the cardinality of partitions grows more slowly with each character than with prefixes like `az9trm/`.

We were told that the latter makes manual partitioning a challenge, because as soon as you reach two characters you've already created 36x36 partitions (1,296).

The issue with that: your keys are no longer meaningful if you're relying on S3 "folders" per tenant, for example (customer1/..).
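
A minimal sketch (my illustration, not from the thread; the helper names are made up) of that kind of layout in Python. The hash-derived variant is one way to keep the full key computable from the logical path while still spreading objects across binary prefixes:

    import hashlib
    import secrets

    PREFIX_BITS = 6  # e.g. "010111/" -> 2^6 = 64 possible prefixes

    def random_binary_prefix(bits: int = PREFIX_BITS) -> str:
        # Purely random prefix, as described above: spreads writes evenly,
        # but the key can't be reconstructed from the logical path alone.
        return format(secrets.randbits(bits), f"0{bits}b")

    def hashed_binary_prefix(logical_key: str, bits: int = PREFIX_BITS) -> str:
        # Deterministic variant (assumption, not what we were told): derive
        # the prefix from a hash of the logical path so keys stay addressable.
        digest = hashlib.sha256(logical_key.encode()).digest()
        value = int.from_bytes(digest[:4], "big") >> (32 - bits)
        return format(value, f"0{bits}b")

    logical_key = "customer1/some/file.jpg"
    s3_key = f"{hashed_binary_prefix(logical_key)}/{logical_key}"
    # -> e.g. "101100/customer1/some/file.jpg"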

rthnbgrredf 5 days ago | parent | prev [-]

Elaborate.

cldcntrl 5 days ago | parent | next [-]

The whole auto-balancing thing isn't instant. If you have a burst of writes with the same key prefix, you'll get throttled.
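
A rough sketch (not from the comment above) of what that looks like in practice with boto3: S3 answers the burst with 503 SlowDown, and you back off and retry while the prefix scales. Bucket and key names are placeholders, and boto3's built-in retry modes can handle much of this for you:

    import time
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")

    def put_with_backoff(bucket: str, key: str, body: bytes, max_retries: int = 5) -> None:
        # Retry 503 SlowDown responses with exponential backoff while the
        # prefix's request rate scales up behind the scenes.
        for attempt in range(max_retries):
            try:
                s3.put_object(Bucket=bucket, Key=key, Body=body)
                return
            except ClientError as e:
                if e.response["Error"]["Code"] != "SlowDown":
                    raise
                time.sleep(0.1 * (2 ** attempt))
        raise RuntimeError(f"still throttled after {max_retries} attempts: {key}")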

hnlmorg 5 days ago | parent | prev [-]

Not the OP, but I’ve had AWS staff recommend different prefixes even as recently as last year.

If key prefixes don’t matter much any more, then it’s a very recent change that I’ve missed.

williamdclt 5 days ago | parent | next [-]

Might just be that the AWS staff wasn't up to date on this

time0ut 5 days ago | parent | next [-]

I have had the same experience within the last 18 months. The storage team came back to me and asked me to spread my ultra high throughput write workload across 52 (A-Za-z) prefixes and then they pre-partitioned the bucket for me.

S3 will automatically do this over time now, but I think there are/were edge cases still. I definitely hit one and experienced throttling at peak load until we made the change.
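
A minimal sketch of how a workload could be spread across those 52 prefixes (this is my illustration, not what the storage team prescribed); hashing the logical key keeps the assignment deterministic:

    import hashlib
    import string

    PREFIXES = string.ascii_uppercase + string.ascii_lowercase  # 52 prefixes, A-Za-z

    def prefixed_key(logical_key: str) -> str:
        # Hash the logical key to pick one of the 52 pre-partitioned prefixes,
        # so the mapping is stable and writes spread roughly evenly.
        digest = hashlib.sha256(logical_key.encode()).digest()
        return f"{PREFIXES[digest[0] % len(PREFIXES)]}/{logical_key}"

    print(prefixed_key("events/2024/05/01/batch-000123.json"))
    # -> e.g. "t/events/2024/05/01/batch-000123.json"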

hnlmorg 5 days ago | parent [-]

That sounds like the problem we were having. Lots of writes to a prefix over a short period of time and then low activity to it after about 2 weeks.

rthnbgrredf 5 days ago | parent | prev | next [-]

By the way, that happens quite frequently. I regularly ask them about new AWS technologies or recent changes, and most of the time they are not aware. They usually say they will call back later after doing some research.

hnlmorg 5 days ago | parent | prev [-]

That’s possible but they did consult with the storage team prior to our consultation.

But I don’t know what conversations did or did not happen behind the scenes.

cldcntrl 5 days ago | parent | prev [-]

That's right, same for me as of only a few months ago.