We sharded over 20 TB that we know about.

This is probably a typo, right? 20TB isn't that big. I would imagine they've sharded a lot more than that.

▲ singron 2 hours ago | parent | next [-]

If your working set is 20 TB, then it's pretty big. Each database has its own mix of hot/cold data, so it's impossible to compare without more information. A better measure might be IOPS. RDS has fairly low maximum IOPS unless you spend a lot more for provisioned IOPS or use Aurora.

▲ rbranson 3 hours ago | parent | prev | next [-]

You are correct. As a point of comparison: almost ten years ago at Segment we had a single Aurora PostgreSQL instance with ~50 TB of data. It was used to index potential identity data in a much larger corpus of files stored in S3.

▲ GiorgioG 4 hours ago | parent | prev [-]

For the vast majority of use cases, 20TB is positively enormous.

	▲ mplanchard 3 hours ago | parent | next [-]
		RDS caps out at 64 TB unless you use Aurora, so 20 TB is totally manageable without sharding.
	▲ returningfory2 4 hours ago | parent | prev | next [-]
		This product is for Postgres deployments that are so large they need to be sharded. For these use cases, I think 20TB is about normal.
	▲ jeltz 3 hours ago | parent | prev | next [-]
		Yes. But for most workloads it is not much for PostgreSQL. You often will not have to shard at all.
	▲ happyopossum 4 hours ago | parent | prev | next [-]
		Sure, but 20TB in “the only database you need” is mere hours or minutes worth of data for many workflows.
	▲ tingletech 4 hours ago | parent | prev [-]
		That article seems to suggest 20TB total over the dozen deployments in prod.