agwa 8 hours ago

> GitHub's migration guide tells developers to treat the new IDs as opaque strings and treat them as references. However it was clear that there was some underlying structure to these IDs as we just saw with the bitmasking

Great, so now GitHub can't change the structure of their IDs without breaking this person's code. The lesson is that if you're designing an API and want an ID to be opaque you have to literally encrypt it. I find it really demoralizing as an API designer that I have to treat my API's consumers as adversaries who will knowingly and intentionally ignore guidance in the documentation like this.

krisoft 8 hours ago | parent | next [-]

> Great, so now GitHub can't change the structure of their IDs without breaking this person's code.

And that is all the fault of the person who treated a documented opaque value as if it has some specific structure.

> The lesson is that if you're designing an API and want an ID to be opaque you have to literally encrypt it.

The lesson is that you should stop caring about breaking people’s code who go against the documentation this way. When it breaks you shrug. Their code was always buggy and it just happened to be working for them until then. You are not their dad. You are not responsible for their misfortune.

> I find it really demoralizing as an API designer that I have to treat my API's consumers as adversaries who will knowingly and intentionally ignore guidance in the documentation like this.

You don’t have to.

vlovich123 6 hours ago | parent | next [-]

Sounds like you’ve maybe never actually run a service or API library at scale. There are so many factors that go into a decision like that at a company that it’s never so simple. Is the person impacted influential? You’ve got a reputation hit if they blog negatively about how you screwed them after something was working for years. Is a customer who’s worth 10% of your annual revenue impacted? Bet your ass your management chain won’t let you make a breaking change, or will declare an incident and have you revert one you already made.

Even in OSS land, you risk alienating the community you’ve built if they’re meaningfully impacted. You only do this if the impact is minimal or you don’t care about alienating anyone using your software.

irjustin 6 hours ago | parent [-]

> Sounds like you’ve maybe never actually run a service or API library at scale.

What was the saying? When your scale is big enough, even your bugs have users.

raincole 6 hours ago | parent [-]

Yeah, but when you are big enough you can afford to not care about individual users.

VS Code once broke a very popular extension that used a private API. Microsoft (rightly) didn't bother to ask if the private API had users.

halestock 6 hours ago | parent | prev [-]

> The lesson is that you should stop caring about breaking people’s code who go against the documentation this way. When it breaks you shrug. Their code was always buggy and it just happened to be working for them until then. You are not their dad. You are not responsible for their misfortune.

Sure, but good luck running a business with that mindset.

Kwpolska 2 hours ago | parent [-]

Apple is pretty successful.

maxbond 8 hours ago | parent | prev | next [-]

You could also say, if I tell you something is an opaque identifier, and you introspect it, it's your problem if your code breaks. I told you not to do that.

lelandfe 7 hours ago | parent [-]

Once "you" becomes a big enough "them" it becomes a problem again.

vlovich123 6 hours ago | parent [-]

Exactly. When you owe the bank $10M it’s a you problem. When you owe the bank $100B it’s a them problem.

bigblind 7 hours ago | parent | prev | next [-]

I think more important than worrying about people treating an opaque value as structured data, is wondering _why_ they're doing so. In the case of this blog post, all they wanted to do was construct a URL, which required the integer database ID. Just make sure you expose what people need, so they don't need to go digging.

Other than that, I agree with what others are saying. If people rely on some undocumented aspect of your IDs, it's on them if that breaks.

plorkyeran 3 hours ago | parent [-]

Exposing what people need doesn’t guarantee that they won’t go digging. It is surprisingly common to discover that someone has come up with a hack that depends on implementation details to do something which you exposed directly and they just didn’t know about it.

vlovich123 6 hours ago | parent | prev | next [-]

That’s literally how I designed all the public-facing R2 tokens, like the ones for multipart uploads. It’s also a security barrier: forging or stealing such tokens is harder, and any exploit needs the cooperation of your servers, so it can be shut down quickly if needed.

kevin_thibedeau 7 hours ago | parent | prev | next [-]

> Great, so now GitHub can't change the structure of their IDs without breaking this person's code

OP can put the decoded IDs into a new column and ignore the structure in the future. The problem was presumably mass querying the GitHub API to get those numbers needed for functional URLs.

cush 6 hours ago | parent | prev | next [-]

At a big enough scale, even your bugs have users

haileys 8 hours ago | parent | prev | next [-]

This is well understood - Hyrum's law.

You don't need encryption, a global_id database column with a randomly generated ID will do.
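
A minimal sketch of the random `global_id` idea, assuming Python's standard `secrets` module. The in-memory dict is a hypothetical stand-in for a table with a `global_id` column; none of these names come from GitHub.

```python
import secrets

def new_global_id(nbytes: int = 16) -> str:
    """Return a random, URL-safe identifier with no embedded structure."""
    return secrets.token_urlsafe(nbytes)

# Hypothetical stand-in for a table indexed by its global_id column.
rows_by_global_id: dict[str, dict] = {}

def insert_row(database_id: int, kind: str) -> str:
    """Store a row under a fresh random ID and return that ID."""
    global_id = new_global_id()
    rows_by_global_id[global_id] = {"database_id": database_id, "kind": kind}
    return global_id

gid = insert_row(42, "PullRequest")
row = rows_by_global_id[gid]  # resolving the ID always needs a lookup
```

Because the ID carries no information, there is nothing for a client to decode; the cost is that the server must do a lookup to learn anything about it.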

maxbond 8 hours ago | parent [-]

You could but you would lose the performance benefits you were seeking by encoding information into the ID. But you could also use a randomized, proprietary base64 alphabet rather than properly encrypting the ID.
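
A rough sketch of the shuffled-alphabet idea, assuming the URL-safe base64 alphabet as the starting point. The seed value is a made-up placeholder for whatever private constant a real service would keep.

```python
import base64
import random

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

# Assumption: a fixed private seed decides the shuffled alphabet.
_shuffled = list(STANDARD)
random.Random(0xC0FFEE).shuffle(_shuffled)
PRIVATE = "".join(_shuffled)

_ENC = str.maketrans(STANDARD, PRIVATE)
_DEC = str.maketrans(PRIVATE, STANDARD)

def encode_id(raw: bytes) -> str:
    """Base64-encode, drop padding, then swap in the private alphabet."""
    std = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
    return std.translate(_ENC)

def decode_id(token: str) -> bytes:
    """Map back to the standard alphabet, restore padding, then decode."""
    std = token.translate(_DEC)
    std += "=" * (-len(std) % 4)
    return base64.urlsafe_b64decode(std)

raw = (123456789).to_bytes(8, "big")
token = encode_id(raw)
```

This is only a substitution over 6-bit groups, so it discourages casual decoding and nothing more.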

pdpi 7 hours ago | parent | next [-]

XOR encryption is cheap and effective. Make the key the static string "IfYouCanReadThisYourCodeWillBreak" or something akin to that. That way, the key itself will serve as a final warning when (not if) the key gets cracked.
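
As a sketch of what that could look like, assuming the ID is a 64-bit integer and the token is hex-encoded (both assumptions, not anything GitHub does):

```python
from itertools import cycle

KEY = b"IfYouCanReadThisYourCodeWillBreak"

def xor_bytes(data: bytes, key: bytes = KEY) -> bytes:
    """XOR data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def obscure_id(database_id: int) -> str:
    """Turn a 64-bit integer ID into a hex token."""
    return xor_bytes(database_id.to_bytes(8, "big")).hex()

def reveal_id(token: str) -> int:
    """Invert obscure_id; XOR with the same key undoes itself."""
    return int.from_bytes(xor_bytes(bytes.fromhex(token)), "big")

token = obscure_id(123456789)
zero_token = bytes.fromhex(obscure_id(0))  # all-zero input exposes the key bytes
```

With an 8-byte ID only the first 8 bytes of the key are ever used, and the token for ID 0 is those key bytes verbatim, which is how the warning would surface.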

Retr0id 7 hours ago | parent | next [-]

Any symmetric encryption is ~free compared to the cost of a network request or db query.

In this particular instance, Speck would be ideal since it supports a 96-bit block size https://en.wikipedia.org/wiki/Speck_(cipher)
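
A compact sketch of Speck with a 96-bit block and 96-bit key (48-bit words, 28 rounds, rotations of 8 and 3), written from the description in the linked article. It assumes the whole ID fits in 96 bits, the key below is only an example value, and the code has not been checked against the published test vectors, so treat it as illustrative.

```python
MASK = (1 << 48) - 1   # Speck96 works on two 48-bit words
ROUNDS = 28            # round count for a 96-bit key (two key words)

def ror(x: int, r: int) -> int:
    return ((x >> r) | (x << (48 - r))) & MASK

def rol(x: int, r: int) -> int:
    return ((x << r) | (x >> (48 - r))) & MASK

def expand_key(key: int) -> list[int]:
    """Derive the round keys; the low word of the key is the first round key."""
    k = key & MASK
    l = (key >> 48) & MASK
    round_keys = [k]
    for i in range(ROUNDS - 1):
        l = ((k + ror(l, 8)) & MASK) ^ i
        k = rol(k, 3) ^ l
        round_keys.append(k)
    return round_keys

def encrypt_block(block: int, round_keys: list[int]) -> int:
    x, y = (block >> 48) & MASK, block & MASK
    for rk in round_keys:
        x = ((ror(x, 8) + y) & MASK) ^ rk
        y = rol(y, 3) ^ x
    return (x << 48) | y

def decrypt_block(block: int, round_keys: list[int]) -> int:
    x, y = (block >> 48) & MASK, block & MASK
    for rk in reversed(round_keys):
        y = ror(y ^ x, 3)
        x = rol(((x ^ rk) - y) & MASK, 8)
    return (x << 48) | y

round_keys = expand_key(0x0D0C0B0A0908050403020100)  # example key only
database_id = 123456789
public_id = f"{encrypt_block(database_id, round_keys):024x}"
recovered = decrypt_block(int(public_id, 16), round_keys)
```

The public ID is always 24 hex characters, and the server recovers the structured value with a single decrypt instead of a database lookup.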

pdpi 7 hours ago | parent [-]

Symmetric encryption is computationally ~free, but most of them are conceptually complex. The purpose of encryption here isn't security, it's obfuscation in the service of dissuading people from depending on something they shouldn't, so using the absolutely simplest thing that could possibly work is a positive.

Retr0id 6 hours ago | parent [-]

XOR with a fixed key is trivial to figure out, which defeats the purpose. Speck is simple enough that a working implementation is included in the Wikipedia article, and most LLMs can one-shot it.
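
To illustrate the point about a fixed XOR key, here is a small sketch: one token whose underlying value is known (say, because the numeric ID also appears in a URL) is enough to recover the key. The key and ID values are hypothetical.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

SECRET_KEY = b"IfYouCan"  # hypothetical 8-byte key, unknown to the client

# The client sees one token and also knows the ID behind it.
known_id = (1234).to_bytes(8, "big")
known_token = xor_bytes(known_id, SECRET_KEY)

# token XOR plaintext yields the key directly.
recovered_key = xor_bytes(known_token, known_id)

# Any other token can now be decoded without the server's help.
other_token = xor_bytes((987654321).to_bytes(8, "big"), SECRET_KEY)
decoded = int.from_bytes(xor_bytes(other_token, recovered_key), "big")
```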

maxbond 7 hours ago | parent | prev [-]

A cryptographer may quibble and call that an encoding but I agree.

pdpi 7 hours ago | parent [-]

A cryptographer would say that XOR ciphers are a fundamental cryptography primitive, and e.g. the basic building blocks for one-time pads.

maxbond 7 hours ago | parent [-]

Yes, XOR is a real and fundamental primitive in cryptography, but a cryptographer may view the scheme you described as violating Kerckhoffs's second principle of "secrecy in key only" (sometimes phrased, "if you don't pass in a key, it is encoding and not encryption"). You could view your obscure phrase as a key, or you could view it as a constant in a proprietary, obscure algorithm (which would make it an encoding). There's room for interpretation there.

Note that this is not a one-time pad because we are using the same key material many times.

But this is somewhat pedantic on my part, it's a distinction without a difference in this specific case where we don't actually need secrecy. (In most other cases there would be an important difference.)

haileys 8 hours ago | parent | prev [-]

Encoding a type name into an ID is never really something I've viewed as being about performance. Think of it more like an area code, it's an essential part of the identifier that tells you how to interpret the rest of it.

maxbond 8 hours ago | parent [-]

That's fair, and you could definitely put a prefix and a UUID (or whatever), I failed to consider that.
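
A minimal sketch of the prefix-plus-UUID shape, using Python's standard `uuid` module. The `pr` and `repo` prefixes and the underscore separator are made up for illustration.

```python
import uuid

def new_typed_id(prefix: str) -> str:
    """Build an ID whose prefix names the type and whose body is random."""
    return f"{prefix}_{uuid.uuid4().hex}"

def split_typed_id(typed_id: str) -> tuple[str, str]:
    """Split into (type prefix, random part) at the first underscore."""
    prefix, _, rest = typed_id.partition("_")
    return prefix, rest

pr_id = new_typed_id("pr")
repo_id = new_typed_id("repo")
kind, body = split_typed_id(pr_id)
```

The prefix works like the area code in the analogy: it says how to interpret the ID, while the random part reveals nothing else.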

nwallin 8 hours ago | parent | prev | next [-]

Hyrum's law is a real sonuvabitch.

lijok 7 hours ago | parent | prev | next [-]

Can GitHub change their API response times? Can they make them faster? If they do, they’ll break my code, ‘cause it expects responses to take at least 1200ms. Any faster than that and I get race conditions. I picked the 1200ms number by measuring response times.

No, you would call me a moron and tell me to go pound sand.

Weird systems were never supported to begin with.

perfmode 7 hours ago | parent | prev | next [-]

The API contract doesn’t stipulate the behavior so GitHub is free to change as they please.

whateveracct 6 hours ago | parent | prev [-]

Who cares if their code is broken in this case? Stupid games stupid prizes.