| ▲ | Signing data structures the wrong way (blog.foks.pub) |
| 91 points by malgorithms 10 hours ago | 42 comments |
| |
|
| ▲ | Retr0id 9 hours ago | parent | next [-] |
| Putting domain separators in the IDL is interesting but you can also avoid the problem by putting the domain separators in-band (e.g. in some kind of "type" field that is always present). Tangentially, depending on what your input and data model look like, canonicalisation takes O(n log n) time (i.e. the cost of sorting your fields). Here I describe an alternative approach that produces deterministic hashes without a distinct canonicalization step, using multiset hashing: https://www.da.vidbuchanan.co.uk/blog/signing-json.html |
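A toy Go sketch of the multiset-hashing idea (not the linked post's exact construction; the leaf encoding and combination rule here are invented for illustration): hash each field independently, then combine the digests with an order-independent operation, so no sorting or canonicalization pass is needed.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

// msetHash combines per-field digests with addition modulo 2^256, so the
// result is independent of field order. (Toy illustration only: a real
// design needs care in the leaf encoding and in the combining function,
// e.g. LtHash-style lanes.)
func msetHash(fields map[string]string) []byte {
	mod := new(big.Int).Lsh(big.NewInt(1), 256)
	acc := new(big.Int)
	for k, v := range fields {
		// Length-prefix the key so "ab"+"c" can't collide with "a"+"bc".
		leaf := sha256.Sum256([]byte(fmt.Sprintf("%d:%s=%s", len(k), k, v)))
		acc.Add(acc, new(big.Int).SetBytes(leaf[:]))
		acc.Mod(acc, mod)
	}
	return acc.FillBytes(make([]byte, 32))
}

func main() {
	// Go maps iterate in randomized order, which conveniently demonstrates
	// that the digest does not depend on field order.
	h := msetHash(map[string]string{"user": "alice", "role": "admin"})
	fmt.Printf("%x\n", h)
}
```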
| |
| ▲ | majormajor 9 hours ago | parent [-] | | I think a lot of people assume that the "name" of the type, for protos, will be preserved somewhere in the output such that a TreeRoot couldn't be re-used as a KeyRevoke. It makes sense that it isn't - you generally don't want to send that name every time - but it's non-obvious to people with an object-oriented-language background who just think "ah, different types are obviously different types." The serialization cost is the objection I've most often seen against in-band type fields and such, as well, so having a unique identifier that gets used just for signature computation is clever. What's possibly over my head, from skimming it, is how your multiset hashing avoids the "these payloads have the same shape, so one could be re-sent as the other" issue? It seems like a solution to a different problem? | | |
| ▲ | Retr0id 8 hours ago | parent | next [-] | | Multiset hashing is not related to the domain separation problem, but it is related to the broader "signing data structures" problem. (I realise my comment reads a bit unclearly, it's basically two separate comments, split after the first paragraph) | |
| ▲ | kccqzy 8 hours ago | parent | prev [-] | | This is just a mismatch between nominal typing and structural typing. Protobuf is basically structural typing. You can serialize a message defined with one schema and deserialize the result to a message with a different schema if the two schemata are compatible enough. Almost all normal programming languages use nominal typing. If you have `struct A {int a; int b};` it is distinct from `struct B {int a; int b};`. | | |
| ▲ | actionfromafar 7 hours ago | parent [-] | | C does too as a language, but it’s fairly easy to slip up at link time or runtime. At some point the types melt away and you sit there with pointers and offsets. Again, it’s not strictly the language’s fault (I think, I’m far from a standards lawyer). |
|
|
|
|
| ▲ | socketcluster 4 hours ago | parent | prev | next [-] |
| The crypto dev community has a strange idea that working with binary is superior. For many algorithms, it's not. It just obfuscates what's happening, and the performance advantage is negligible, especially in the context of all the other logic in the system, which uses far more resources. I didn't know that Protobuf wasn't canonical, but even without this knowledge, there are many other factors which make it an inferior format to JSON.
Also, on a related topic: it seems unwise that essentially all the cryptographic primitives everyone is using are often distributed as compiled binaries. I cannot think of anything more antithetical to security than that.
I implemented my own stateful signature algorithm for my blockchain project from scratch, using utf8 as the base format and HMAC-SHA256 for key derivation. It makes it so much easier to understand and implement correctly. It uses Lamport OTS with Merkle MSS. The whole thing, including all dependencies, is like 4000 lines of easy-to-read JavaScript code. About 300 lines of code for MSS and 300 lines for Lamport OTS... The rest are just generic utility functions.
You don't need to trust anyone else to "do it right" when the logic is simple and you can read it and verify it yourself! Simplicity of implementation and verification of the code is a critical feature IMO. If your perfect crypto library is so complex that only 10 people in the world can understand it, that's not very secure! There is massive centralization and supply chain risk. You're hoping that some of these 10 people will regularly review the code and dependencies... Will they? Can you even trust them? Choosing a popular cryptographic library which distributes binaries is basically trading off the risk of implementation mistakes for the risk of supply chain attacks... which seems like the greater risk.
Anyway, it's kind of wild to now be reading this and seeing people finally coming round to this approach. I've been saying this for years.
You can check out https://www.npmjs.com/package/lite-merkle; feedback welcome. |
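For reference, Lamport OTS itself fits in a page. A minimal Go sketch (not the commenter's JavaScript; this toy omits the Merkle/MSS layer, and each key must sign exactly one message):

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

const n = 256 // bits of the SHA-256 message digest

type PrivKey [n][2][32]byte // one random preimage per (bit index, bit value)
type PubKey [n][2][32]byte  // the hash of each preimage

func KeyGen() (PrivKey, PubKey) {
	var sk PrivKey
	var pk PubKey
	for i := 0; i < n; i++ {
		for b := 0; b < 2; b++ {
			if _, err := rand.Read(sk[i][b][:]); err != nil {
				panic(err)
			}
			pk[i][b] = sha256.Sum256(sk[i][b][:])
		}
	}
	return sk, pk
}

// Sign reveals, for each bit of the message digest, the preimage
// corresponding to that bit's value. ONE-TIME: signing a second message
// leaks preimages from both halves and enables forgeries.
func Sign(sk PrivKey, msg []byte) [n][32]byte {
	d := sha256.Sum256(msg)
	var sig [n][32]byte
	for i := 0; i < n; i++ {
		bit := (d[i/8] >> (uint(i) % 8)) & 1
		sig[i] = sk[i][bit]
	}
	return sig
}

func Verify(pk PubKey, msg []byte, sig [n][32]byte) bool {
	d := sha256.Sum256(msg)
	for i := 0; i < n; i++ {
		bit := (d[i/8] >> (uint(i) % 8)) & 1
		if sha256.Sum256(sig[i][:]) != pk[i][bit] {
			return false
		}
	}
	return true
}

func main() {
	sk, pk := KeyGen()
	sig := Sign(sk, []byte("hello"))
	fmt.Println(Verify(pk, []byte("hello"), sig)) // true
	fmt.Println(Verify(pk, []byte("evil"), sig))  // false
}
```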
|
| ▲ | lukev 8 hours ago | parent | prev | next [-] |
| So, isn't this a rather longwinded way to say that a signature only extends to the scope of the message it contains? It doesn't matter if I sign the word "yes" if you don't know what question is being asked. The signature needs to include the necessary context for it to be meaningful. Lots of ways of doing that, and you definitely need to be thoughtful about redundant data and storage overhead, but the concept isn't tricky. |
| |
| ▲ | maxtaco 7 hours ago | parent [-] | | Hi, post author here. Agree that the idea isn't tricky, but it seems like many systems still get it wrong, and there wasn't an available system that had all the necessary features. I've tried many of them over the years -- XDR, JSON, Msgpack, Protobufs. When I sat down to write FOKS using protobufs, I found myself writing down "Context Strings" in a separate text file. There was no place for them to go in the IDL. I had worked on other systems where the same strategy was employed. I got to thinking, whenever you need to write down important program details in something that isn't compiled into the program (in this case, the list of "context strings"), you are inviting potentially serious bugs due to the code and documentation drifting apart, and it means the libraries or tools are inadequate. I think this system is nice because it gives you compile-time guarantees that you can't sign without a domain separator, and you can't reuse a domain separator by accident. Also, I like the idea of generating these things randomly, since it's faster and scales better than any other alternative I could think of. And it even scales into some world where lots of different projects are using this system and sharing the same private keys (not a very likely world, I grant you). |
|
|
| ▲ | efitz 6 hours ago | parent | prev | next [-] |
| When my data structures are messages to be sent over a network, I always start with msgId and msgLen, both fixed width fields. This solves the message differentiation problem explicitly, makes security and memory management easier, and reduces routing to: switch(msg.msgId):
… |
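A minimal Go sketch of that header layout (field widths, message IDs, and handler names are invented for illustration):

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Every message starts with fixed-width msgId and msgLen fields, so the
// receiver can differentiate and bounds-check before touching the body.
const (
	msgTreeRoot  uint32 = 1
	msgKeyRevoke uint32 = 2
	headerLen           = 8 // 4-byte msgId + 4-byte msgLen
)

func encode(msgId uint32, body []byte) []byte {
	buf := make([]byte, headerLen+len(body))
	binary.BigEndian.PutUint32(buf[0:4], msgId)
	binary.BigEndian.PutUint32(buf[4:8], uint32(len(body)))
	copy(buf[headerLen:], body)
	return buf
}

func route(frame []byte) (string, error) {
	if len(frame) < headerLen {
		return "", errors.New("short frame")
	}
	msgId := binary.BigEndian.Uint32(frame[0:4])
	msgLen := binary.BigEndian.Uint32(frame[4:8])
	if uint32(len(frame)-headerLen) != msgLen {
		return "", errors.New("length mismatch") // cheap memory-safety check
	}
	switch msgId { // routing reduces to one switch, as the comment says
	case msgTreeRoot:
		return "tree-root handler", nil
	case msgKeyRevoke:
		return "key-revoke handler", nil
	default:
		return "", fmt.Errorf("unknown msgId %d", msgId)
	}
}

func main() {
	h, err := route(encode(msgTreeRoot, []byte("payload")))
	fmt.Println(h, err) // tree-root handler <nil>
}
```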
|
| ▲ | tantalor 9 hours ago | parent | prev | next [-] |
| Since the example was given in proto, I'll suggest a solution in proto: add a message option. extend google.protobuf.MessageOptions {
optional uint64 domain_separator = 1234;
}
message TreeRoot {
option (domain_separator) = 4567;
...
}
|
|
| ▲ | cogman10 8 hours ago | parent | prev | next [-] |
| Why not digest the type as part of the hash? This avoids the problem in the article and keeps the transmission size small. |
| |
| ▲ | maxtaco 8 hours ago | parent | next [-] | | It should be possible to change the name of the type, and this happens often in practice. But type renames shouldn't break preexisting signatures. In this scheme you are free to change the type name, and preexisting signatures still verify with new code -- as long, of course, as you never change the domain separator, which you should never do. Also, you'd need to worry about two different projects reusing the same type name. Lastly, the transmission size in this scheme remains unaffected, since the domain separators do not appear in the serialized data. Rather, both sides agree on them via the protocol specification. | | |
| ▲ | actionfromafar 7 hours ago | parent [-] | | That’s easily addressed. We just need a global immutable registry of types, their names, an alias list and revocation list. ;-) We can let one be managed by ICANN and the others various competing offerings on ETH. |
| |
| ▲ | tennysont 8 hours ago | parent | prev [-] | | They use a magic number, rather than a digest derived from the schema[1], but otherwise they do as you suggest. The magic number is given to the signing function (sender side) and the validation function (receiver side) but does not increase the size of the transmitted message. [1] I think that's what you mean by digest, but maybe you just mean `type` = `magic number` | | |
| ▲ | patrakov 18 minutes ago | parent [-] | | The problem here is that a digest derived from the schema would just reintroduce the possibility of confusion of identically encoded but semantically distinct types. |
|
|
|
| ▲ | Muromec 8 hours ago | parent | prev | next [-] |
| So another lesson has been relearned from asn.1. I'm proud of working in this industry again! Next we'll figure out that we should always put versions into the data too. |
| |
| ▲ | maxtaco 8 hours ago | parent | next [-] | | I would say two problems with the asn.1 approach are: (1) it seems like too much cognitive load for the OIDs to have semantic meaning, and it invites accidental reuse; I think it matters way more that the OIDs are unique, which randomness gets you without much effort; and (2) the OIDs aren't always serialized first, they are allowed to be inside the message, and there are failures that have resulted (https://nvd.nist.gov/vuln/detail/cve-2022-24771, https://nvd.nist.gov/vuln/detail/CVE-2025-12816) (edit on where the OIDs can be, and added another CVE) | | |
| ▲ | themafia 6 hours ago | parent [-] | | Those CVEs seem a little more subtle than OID serialization issues. In the first example there are actually two distinct problems in concert that lead to the vulnerability, one of which is when a "low public exponent" is used. https://github.com/digitalbazaar/forge/commit/3f0b49a0573ef1... | | |
| ▲ | tptacek 4 hours ago | parent | next [-] | | This is Bleichenbacher's rump-session e=3 RSA attack. It's pretty straightforward, and is in Cryptopals if anyone wants to try it. If you don't check all the RSA padding, and you use e=3, you can just take an integer cube root. | |
| ▲ | maxtaco 5 hours ago | parent | prev [-] | | It seems like in that PR, the fact that the OID wasn't checked is part of the problem. I think a better system wouldn't compile or would always fail to verify if the OID (domain separator) is wrong, and I think you'd get that behavior in the posted system. |
|
| |
| ▲ | jbmsf 8 hours ago | parent | prev [-] | | That was my first thought as well. |
|
|
| ▲ | formerly_proven 9 hours ago | parent | prev | next [-] |
| This article claims that these are somewhat open questions, but they're not and have not been for a long time. #1 You sign a blob and you don't touch it before verifying the signature (aka "The Cryptographic Doom Principle") #2 Signatures are bound to a context which is _not_ transmitted but used for deriving the key or mixed into the MAC or what have you. This is called the Horton principle. It ensures that signer/verifier must cryptographically agree on which context the message is intended for. You essentially cannot implement this incorrectly because if you do, all signatures will fail to verify. The article actually proposes to violate principle #2 (by embedding some magic numbers into the protocol headers and presuming that someone will check them), which is an incorrect design and will result in bad things if history is any indication. Principles #1 and #2 are well-established cryptographic design principles for just a handful of decades each. |
| |
| ▲ | tennysont 8 hours ago | parent | next [-] | | Hmmmm. I agree that an ad-hoc implementation with protobufs can go wrong. But presumably, 1 canonical encoding for the private key constitutes the Horton principle? It seems like Horton Principle just says "all messages have ≤1 meaning". If a message signed by key X must be parsed using the canonical encoding, then aren't we done? There is still room for danger. e.g., You send `GetUserPermissionLevel(user:"Alice")` and server responds with `UserNicknameIs(user:"Alice", value:"admin")`. If you fail to check the message type, you might get tricked. Maybe it's nice if it was mathematically impossible to validate the signature without first providing your assumptions. e.g., The subroutine to validate message `UserNicknameIs(user:"Alice", value:"admin")` requires `ServerKey × ExpectedMessageType`. But "ExpectedMessageType" isn't the only assumption being made, is it? You might get back `UserPermissionLevel(user:"Bob", value:"admin")` or `UserPermissionLevel(user:"Alice", value:"admin", timestamp:"<3d old>")`. Will we expect the MAC to somehow accept a "user" value? And then what do we do about "timestamp"? Maybe we implement `ClientMessage(msgUuid: UUID, requestData:...)` and `ServerResponse(clientMsgUuid: UUID, responseData:...)`, but now the UUID is a secret, vulnerable to MITM attack unless data is encrypted. It seems like you simply must write validation code to ensure that you don't misinterpret the message that is signed. There simply isn't any magic bullet. Having multiple interpretations for a sequence of bytes is a non-starter (addressed in the post). But once you have a single interpretation for a sequence of bytes, isn't it up to the developer to define a schema + validation logic that supports their use case? Maybe there are good off-the-shelf patterns, but--again--no magic bullets? | | |
| ▲ | themafia 6 hours ago | parent [-] | | Are keys that expensive to generate? You could have a unique signature key for each data type. |
| |
| ▲ | ahtihn 9 hours ago | parent | prev | next [-] | | Maybe I'm misunderstanding the article but I'm fairly sure the magic number is not transmitted. It's used exactly as you say: a shared context used as input for the signature that is not transmitted. | | |
| ▲ | amluto 7 hours ago | parent | next [-] | | You’re right, but I think the commenter you’re replying to is also right. The OP is using unreadable hex strings in a way that obscures what’s actually going on. If you turn those strings into functionally equivalent text, then the signatures are computed over: (serialized object, “This is a TreeRoot”)
and the verifier calls the API: func Verify(key Key, sig []byte, obj VerifiableObjecter) error
| (I assume they meant Object not Objector.) This API is wrong, full stop. Do not use this design. Sure, it might catch one specific screwup, but it will not catch subtler errors like confusing a TreeRoot that the signer trusts with a TreeRoot that means something else entirely. And it requires canonical encodings, which serves no purpose here. And it forces the verifier to deserialize unverified data, which is a big mistake. The right solution is to have the sender sign a message, where: (a) At the time of verification, the message is just bytes, and (b) The message is structured such that it contains all the information needed to interpret it correctly. So the message might be a serialization of a union where one element is “I trust this TreeRoot” and another is “I revoke this key”, etc., and the verification API verifies bytes. If you want to get fancy and make domain separation and forward-and-backward compatibility easier, then build a mini deserializer into the verifier that deserializes tuples of bytes, or at most UUIDs or similar. So you could sign (UUID indicating protocol v1 message type Foo, serialization of a Foo). And you make that explicit to the caller. And the verifier (a) takes bytes as input and (b) does not even try to parse them into a tuple until after verifying the signature. P.S. Any protocol that uses the OP’s design must be quite tortured. How exactly is there a sensible protocol where you receive a message, read enough of it to figure out what type (in the protobuf sense) it contains such that there is more than one possible choice, then verify the data of that type? Are they expecting that you have a message containing a oneof and you sign only the oneof instead of the entire message? Why? |
| ▲ | lokar 9 hours ago | parent | prev [-] | | No, I'm pretty sure they are saying you need to transmit it | | |
| ▲ | nightpool 9 hours ago | parent | next [-] | | No, they propose just concatenating it with the data received from the network > it makes a concatenation of the domain separator (@0x92880d38b74de9fb) and the serialization of the object, and then feeds the byte stream into the signing primitive. Similarly, verification of an object verifies this same reconstructed concatenation against the supplied signature. > Note that the domain separator does not appear in the eventual serialization (which would waste bytes), since both signer and receiver agree on it via this shared protocol specification. Encrypt, HMAC, and hash work the same way | | |
| ▲ | tennysont 8 hours ago | parent | next [-] | | You are, of course, right. And this distinction is important for this chain of comments. Though, in fairness, that is /kind of/ like transmitting it---in the sense that it impacts the message that is returned. It's more akin to sending a checksum of the magic number, rather than the magic number itself. But conceptually, that is just an optimization. The desire is for the client to ensure the server is using the same magic number, we just so happen to be able to overload the signature to encode this data without increasing the message size. | |
| ▲ | lokar 8 hours ago | parent | prev [-] | | Oh, it's just in the hash input. So if you don't use the right ID when you check the hash, it fails. |
| |
| ▲ | jcalvinowens 9 hours ago | parent | prev | next [-] | | I think not: > Note that the domain separator does not appear in the eventual serialization (which would waste bytes), since both signer and receiver agree on it via this shared protocol specification. But saying it's about wasting bytes is a little confusing, as you observe that isn't really the point. | |
| ▲ | jeffrallen 8 hours ago | parent | prev [-] | | It is definitely not transmitted. Domain separation happens in the input to the hash function, not on the wire. Because what arrives off the wire is UNTRUSTED input. |
|
| |
| ▲ | Muromec 8 hours ago | parent | prev | next [-] | | The article proposes a way to agree on context out of band and enforce it with idl. This seems to be an implementation of the principle you mention | | |
| ▲ | amluto 7 hours ago | parent [-] | | No, it’s completely wrong. It’s a very minor refinement of a terrible yet sadly common design that merely mitigates one specific way that the terrible design can fail. See my other comment here. By the time you call the OP’s proposed verify API you have already screwed up as a precondition of calling the API. |
| |
| ▲ | lokar 9 hours ago | parent | prev [-] | | What if (and this is perhaps too big an if) you only ever serialize and de-serialize with code generated from the IDL, which always checks the magic numbers (returning a typed object)? | | |
| ▲ | jeffrallen 8 hours ago | parent [-] | | It's a big if because the threat model normally includes "bad guys can forge messages". Which means that the input is untrusted and you want to generate your own domain separation bytes for the hash function, not let your attacker choose them. |
|
|
|
| ▲ | colek42 5 hours ago | parent | prev | next [-] |
| DSSE is great for this, if you need more schema use in-toto |
|
| ▲ | jeffrallen 8 hours ago | parent | prev | next [-] |
| This is a nice explanation of an obvious idea. Both domain separation, and putting the domain signifier into the IDL are fine, but not novel. Crypto is hard. Do it right. Get help from your tools. 'Nuff said. Jeeze, I'm getting too old for this crap. |
|
| ▲ | 8 hours ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | logicallee 8 hours ago | parent | prev [-] |
| along the same lines, did you know that you can get an authenticated email that the listed sender never sent to you? If a third party can get a server to send an email to themselves (for example, Google Forms will send them an email with the contents that they want), they can then forward it to you while spoofing the From: field as google.com, and it will appear in your inbox as from that "sender" (google.com) and fully authenticated, even though Google never actually sent you that. This is another example where you would think that "who it's for" is something the sender would sign, but nope! |
| |
| ▲ | tennysont 8 hours ago | parent [-] | | I asked about this on the PGP mailing list at one point, and I think I was told that the best solution is to start emails with "Hi <recipient>," which seems like a funny low-tech solution to a (sad) problem. | | |
|