| ▲ | bastawhiz 6 hours ago | |
1. The list of "scopes" is the object hierarchy that owns the resource. That lets you figure out which shard a resource should be in. You want all the resources for the same repository on the same shard; otherwise, if you simply hash the id, one shard going down takes down much of your service, since everything is spread more or less uniformly across shards.

2. The object identifier is at the end. That should be strictly increasing, so all the resources for the same scope are ordered in the DB. This is one of the benefits of uuid7.

3. The first element is almost certainly a version. If you do a migration like this, you don't want to rule out doing it again. If you're packing bits, it's nearly impossible to know what's in the data without an identifier, so without the version you might not be able to know whether the id is new or old.

Another commenter mentioned that you should encrypt this data. Hard pass! Decrypting each id is decidedly slower than b64 decode. Moreover, if you're picking apart IDs, you're relying on an interface that was never made for you. There's nothing sensitive in there: you're just setting yourself up for a possible (probable?) world of pain in the future. GitHub doesn't have to stop you from shooting your foot off.

Moreover, encrypting the contents of the ID makes them sort randomly. This is to be avoided: it means similar/related objects are not stored near each other, and you can't do simple range scans over your data.

You could decrypt the ids on the way in and store both the unencrypted and encrypted versions in the DB, but why? That's a lot of complexity, effort, and resources to stop randos on the Internet from relying on an internal, non-sensitive data format.

As for the old IDs that are still appearing, they are almost certainly one of:

1. Sharded by their own id (i.e., users are sharded by user id, not repo id), so you don't need additional information. Use something like rendezvous hashing to choose the shard.

2. Sharded before the new id format was developed, and it's just not worth the trouble to change.
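To make the layout concrete, here's a rough Python sketch of an id that packs a version, the owning scopes, and the object id, plus rendezvous hashing to pick a shard. Everything specific here is my own assumption for illustration and not GitHub's actual format: the field widths, the shard names, the helper names (`encode_id`, `decode_id`, `pick_shard`, `shard_for`), and the choice to shard on the first scope entry.

```python
import base64
import hashlib
import struct

# Hypothetical shard names, for illustration only.
SHARDS = ["shard-a", "shard-b", "shard-c", "shard-d"]


def encode_id(version, scopes, object_id):
    """Pack version, scope count, scope ids, then the object id (big-endian)."""
    raw = struct.pack(">BB", version, len(scopes))
    for scope in scopes:
        raw += struct.pack(">Q", scope)
    raw += struct.pack(">Q", object_id)
    # base64url without padding, so the id is safe to put in URLs.
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")


def decode_id(text):
    """Inverse of encode_id: returns (version, scopes, object_id)."""
    raw = base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))
    version, count = struct.unpack_from(">BB", raw, 0)
    values = struct.unpack_from(f">{count + 1}Q", raw, 2)
    return version, list(values[:-1]), values[-1]


def pick_shard(key, shards=SHARDS):
    """Rendezvous hashing: score every shard for this key, take the highest."""
    def score(shard):
        digest = hashlib.sha256(f"{shard}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(shards, key=score)


def shard_for(text):
    """Shard on the first scope if there is one, else on the object's own id."""
    _version, scopes, object_id = decode_id(text)
    key = scopes[0] if scopes else object_id
    return pick_shard(key)


# Two resources under the same (hypothetical) repository id 42.
issue = encode_id(1, [42], 1001)
comment = encode_id(1, [42], 1002)
print(shard_for(issue) == shard_for(comment))
```

Because the scope comes before the object id and everything is big-endian, the raw packed bytes sort by scope first and then by object id, which is what keeps related rows adjacent. With rendezvous hashing, dropping a shard only moves the keys that lived on that shard; everything else stays put.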
| ▲ | inopinatus 2 hours ago | parent [-] | |
AES is faster than base64 on modern CPUs, especially for small messages.