Remix.run Logo
chuckadams a day ago

It's the nature of B+ trees, multiplied by the nature of clustered indexes: if you use a UUIDv4 as a primary key, your entire row gets moved to random locations, which really sucks when you normally retrieve them sequentially. With a non-clustered index (say, your UUIDv4 id you use for public APIs when you don't want to leak the v7 info) then you'll still get more fragmentation with the random data, but it's something autovacuum can usually keep up with. But it's more work it has to do on top of everything else it does.

masklinn a day ago | parent [-]

Gp mentioned Postgres, which does not have clustered indexes. It has table clustering, which is a point operation rewriting the entire table but not a persistent property.

chuckadams 15 hours ago | parent [-]

Ah, I forgot CLUSTER was something run by hand on PG. Same footgun then, but you have to load and aim it yourself instead of being fully automatic like it is in MySQL, where it appears you can't opt out of clustering by the PK (similar story in SQL Server, but you can change which index it clusters by). Thanks for the clarification.

evanelias 14 hours ago | parent [-]

This is a bit pedantic, but you’re conflating MySQL with InnoDB. (In MySQL’s model, you can opt out of having a clustered index by choosing a storage engine that doesn’t use clustering.)

In practical terms, choosing a non-InnoDB storage engine is pretty rare with upstream MySQL, but perhaps slightly less so in Percona Server or MariaDB.