bob1029 2 days ago

I don't understand the need for this level of engineering. It appears we are going for an opaque bearer token here. The checksum is pointless because an entire 512 bit token still fits in an x86 cache line. Comparing the whole sequence won't show up in any profiler session you will ever care about.

If you want aspects of the token to be inspectable by intermediaries, then you want JSON Web Tokens (JWTs) or a similar technology. You do not want to conflate these ideas. JWTs would solve the stated database concern. All you need to store in a JWT scheme are the private/public keys. Explicit tracking of the session is not required.

notpushkin 2 days ago | parent | next [-]

> The checksum is pointless because an entire 512 bit token still fits in an x86 cache line

I suppose it’s there to avoid round-trip to the DB. Most of us just need to host the DB on the same machine instead, but given sharding is involved, I assume the product is big enough this is undesirable.

phire 2 days ago | parent | next [-]

You need to support revocation, so I'm not sure it's ever possible to avoid the need for a round trip to verify the token.

kukkamario 2 days ago | parent [-]

The point of the checksum is to just drop obviously wrong keys. No need to handle revocation or do any DB access if checksum is incorrect, the key can just be rejected.
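As a rough illustration of that pre-check, here is a minimal sketch. The key layout (an `sk_` prefix, a random hex body, and a CRC32 suffix) and the names `make_key` and `looks_valid` are assumptions made for the example, not the format from the article.

```python
import secrets
import zlib

PREFIX = "sk_"  # hypothetical prefix, not taken from the article


def make_key() -> str:
    """Generate a random hex body and append its CRC32 as 8 hex characters."""
    body = secrets.token_hex(24)
    checksum = format(zlib.crc32(body.encode()), "08x")
    return f"{PREFIX}{body}_{checksum}"


def looks_valid(key: str) -> bool:
    """Cheap local check. A True result still needs the real DB lookup."""
    if not key.startswith(PREFIX):
        return False
    body, sep, checksum = key[len(PREFIX):].rpartition("_")
    if not sep:
        return False
    return format(zlib.crc32(body.encode()), "08x") == checksum
```

A request whose key fails `looks_valid` can be rejected without touching the database. A key that passes has only been shown to be well-formed, so revocation and ownership still have to be checked the usual way.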

ben-schaaf 2 days ago | parent [-]

That sounds like it's only helpful for DDoS mitigation, in which case the attacker could trivially synthesize a correct checksum.

phire 2 days ago | parent [-]

You don't have to use a publicly documented checksum.

If you use a cryptographically secure hashing algorithm, mix in a secret salt and use a long enough checksum, attackers would find it nearly impossible to synthesise a correct checksum.

ben-schaaf 2 days ago | parent [-]

I don't follow. The checksum is in "plain text" in every key. It's trivial to find the length of the checksum and the checksum is generated from the payload.

Others have pointed out that the checksum is for offline secret scanning, which makes a lot more sense to me than DDoS mitigation.

phire a day ago | parent [-]

I'm not sure it's a good idea.

But it's trivial to make a secret checksum. Just take the key, concatenate it with a secret 256-bit key that only the servers know and hash it with sha256. External users might know the length of the checksum and that it was generated with sha256. But if they don't know the 256-bit key, then it's impossible for them to generate it short of running a brute force attack against your servers.

But it does make the checksum pretty useless for other use cases, as nobody can verify the checksum without the secret.
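A minimal sketch of such a keyed checksum follows. Note that it swaps in HMAC-SHA256, the standard keyed-hash construction, in place of the plain concatenate-and-hash described above. `SERVER_SECRET`, `TAG_HEX_LEN`, the `.` separator, and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

SERVER_SECRET = secrets.token_bytes(32)  # hypothetical 256-bit secret known only to the servers
TAG_HEX_LEN = 16  # assumed truncation: keep 64 bits of the tag


def tag_for(body: str, secret: bytes = SERVER_SECRET) -> str:
    """Keyed tag over the key body, truncated to TAG_HEX_LEN hex characters."""
    return hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()[:TAG_HEX_LEN]


def issue_key() -> str:
    """Random body plus its keyed tag, joined by a dot."""
    body = secrets.token_hex(24)
    return f"{body}.{tag_for(body)}"


def tag_is_valid(key: str, secret: bytes = SERVER_SECRET) -> bool:
    """Recompute the tag and compare in constant time."""
    body, sep, tag = key.rpartition(".")
    if not sep:
        return False
    return hmac.compare_digest(tag_for(body, secret).encode(), tag.encode())
```

Without the secret, a client cannot produce a tag that passes `tag_is_valid` other than by guessing, and for the same reason an offline scanner that lacks the secret cannot use the tag to confirm a candidate key.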

ben-schaaf a day ago | parent [-]

Ah that makes sense. I wouldn't call that a checksum though; that's a signature :)

phire a day ago | parent [-]

I don't think it counts as a signature, because it can't be verified without revealing the same secret used to create it.

ben-schaaf a day ago | parent [-]

You're right, the correct term seems to be MAC (Message Authentication Code).

rrr_oh_man 2 days ago | parent | prev | next [-]

> I assume the product is big enough

Experience says otherwise

locknitpicker 2 days ago | parent | prev [-]

> I suppose it’s there to avoid round-trip to the DB.

That assumption is false. The article states that the DB is hit either way.

From the article:

> The reason behind having a checksum is that it allows you to verify first whether this API key is even valid before hitting the DB,

This is absurdly redundant. Caching DB calls is cheaper and simpler to implement.
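For comparison, a cache in front of the lookup could be sketched roughly as below. The `TTLCache` class, the `lookup` callable, and the 30-second default are hypothetical and only stand in for whatever the real data layer provides.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Caches API key lookups, including misses, for a short time."""

    def __init__(self, lookup: Callable[[str], Optional[str]], ttl_seconds: float = 30.0):
        self._lookup = lookup  # hypothetical DB call: returns an owner id or None
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}

    def get(self, api_key: str) -> Optional[str]:
        now = time.monotonic()
        hit = self._entries.get(api_key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh entry: no DB access
        owner = self._lookup(api_key)  # the only place the DB is touched
        self._entries[api_key] = (now, owner)
        return owner
```

The TTL bounds how long a revoked key can keep working from cache. This sketch never evicts entries, so a real deployment would also need a size cap to cope with a flood of random keys.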

If this were a local validation check, where the API key's signature would be checked against a secret to avoid a DB round trip, then I could see the value in it. But that's already well into the territory of an access token, which would be enough to reject the whole idea.

If I saw a proposal like that in my org I would reject it on the grounds of being technically unsound.

vjay15 2 days ago | parent | prev | next [-]

Hello bob! The checksum is for offline secret scanning and also for rejecting api keys which might have a typo (a niche case).

I was just confused about the JWT approach, since from the research I did I saw that an API key is supposed to be a unique string and that's it!
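To show what offline scanning with a checksum might look like, here is a small sketch. The key layout (`sk_`, 48 hex characters, `_`, 8 hex characters of CRC32) and the name `find_leaked_keys` are assumptions for the example, not the article's actual format.

```python
import re
import zlib
from typing import List

# Hypothetical layout: "sk_" + 48 hex chars + "_" + CRC32 of the body as 8 hex chars.
CANDIDATE = re.compile(r"\bsk_([0-9a-f]{48})_([0-9a-f]{8})\b")


def find_leaked_keys(text: str) -> List[str]:
    """Return strings that match the key pattern and whose checksum verifies."""
    found = []
    for match in CANDIDATE.finditer(text):
        body, checksum = match.groups()
        # The checksum filters out look-alike strings without any network or DB call.
        if format(zlib.crc32(body.encode()), "08x") == checksum:
            found.append(match.group(0))
    return found
```

A scanner can run this over a repository or log dump. The regex finds candidates and the checksum cuts false positives, which only works because anyone can recompute the checksum.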

petterroea 2 days ago | parent | next [-]

I may be naive, but I can't imagine anyone typing an API key by hand. Optimizing for it sounds like premature optimization. Surely stopping the fewer than one in a million HTTP requests with a hand-typed API key from reaching the DB isn't worth anything.

vjay15 2 days ago | parent [-]

If not for typos, then I can use it for secret scanning :)

petterroea 2 days ago | parent [-]

Good point!

bob1029 2 days ago | parent | prev | next [-]

The neat thing about JWT is that there are no secrets to scan for. Your secret material ideally lives inside an HSM and never leaves. Scanning for these private keys is a waste of energy if they were generated inside the secure context.

agwa 2 days ago | parent | next [-]

But JWTs are usually used as bearer tokens when doing API authentication. Those are definitely secrets that need to be scanned for.

Or are you suggesting that the API requests are signed with a private key stored in an HSM, and the JWT certifies the public key? Is that common?

bob1029 2 days ago | parent | next [-]

> are you suggesting that the API requests are signed with a private key stored in an HSM, and the JWT certifies the public key? Is that common?

Very. The thing that certifies the public key is called a JWK.

https://datatracker.ietf.org/doc/html/rfc7517

This is typically hosted at a special URL that enables seamless key rotation and discovery.

https://auth0.com/docs/secure/tokens/json-web-tokens/json-we...

mattacular 2 days ago | parent | prev [-]

That's how JWT is designed to work

vjay15 2 days ago | parent | prev [-]

Ideally an API key shouldn't contain anything about the account or any other info, right? It's meant to be an opaque string, which is what I found in most of the other articles I read. Please do let me know if I am wrong about this assumption.

ijustlovemath 2 days ago | parent | next [-]

JWT operates on a different principle; the user's private key (API key) never leaves the user's device. Instead, the stated "role" and other JSON data are signed with the server's pubkey, then verified by the server using its master key, granting the permissions that role allows.

miningape 2 days ago | parent | prev [-]

Look at the JWT standard, it usually contains things like claims, roles, user ids, etc.
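To make the "claims inside the token" point concrete, here is a minimal sketch that builds and verifies an HS256 JWT using only the standard library. HS256 (a shared secret) is an assumption chosen to keep the example self-contained; an asymmetric algorithm such as RS256 would need a cryptography library. The claim names and helper functions are illustrative, and the sketch skips checks a real library performs, such as validating the `alg` header and expiry.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(signature)}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Check the signature, then return the claims. No DB lookup is involved."""
    header, payload, signature = token.split(".")
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload))
```

The payload segment is only base64url-encoded, not encrypted, so anyone holding the token can read the claims. That is the opposite of an opaque API key.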

arethuza 2 days ago | parent | prev [-]

"for rejecting api keys which might have a type" - assuming that is meant to be "typo" - won't they get rejected anyway?

vjay15 2 days ago | parent [-]

It's just an added benefit; I don't have to make a DB call to verify that :)

Hendrikto 2 days ago | parent | prev [-]

JWTs solve some problems but then come with a lot of their own. I do not think they should be the go-to solution.