PunchyHamster 5 hours ago

Nowadays even many small microcontrollers get AES acceleration so I don't see much reason

chowells 4 hours ago | parent | next [-]

Basically all of the use cases in the article don't make sense with AES. That's not because it's AES. That's because its blocks are significantly larger than the data you want to protect. That's the point the article was making: in very specific circumstances, there is practical value in having the cipher output be small.

fluoridation 3 hours ago | parent | next [-]

In that case just use CTR mode, no?

tptacek 13 minutes ago | parent | next [-]

https://www.cs.ucdavis.edu/~rogaway/papers/thorp.pdf

(Not that this is the only solution but that it motivates the problem of why you can't just naively apply AES to the problem).

201984 an hour ago | parent | prev | next [-]

In the context of encrypting 32 or 64 bit IDs, where there is no nonce, that'd be equivalent to XOR encryption and much weaker than TFA's small block ciphers.
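
A minimal sketch of why CTR without a nonce degenerates to XOR by a constant. It assumes a 32-bit ID; the key is hypothetical, and SHA-256 stands in for the first AES keystream block only so that the example runs with the standard library.

```python
import hashlib

def keystream32(key: bytes) -> int:
    # Stand-in for the first 32 bits of AES_key(counter = 0).
    # SHA-256 is NOT AES; it only plays the role of a fixed keystream block.
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

def ctr_no_nonce(key: bytes, value: int) -> int:
    # With no nonce, every ID is XORed with the same 32-bit constant.
    return value ^ keystream32(key)

key = b"hypothetical key"
id_a, id_b = 1000, 1001
ct_a = ctr_no_nonce(key, id_a)
ct_b = ctr_no_nonce(key, id_b)

# The XOR of two ciphertexts equals the XOR of the two plaintext IDs.
assert ct_a ^ ct_b == id_a ^ id_b

# One known (ID, ciphertext) pair reveals the constant and every other ID.
recovered_pad = ct_a ^ id_a
assert ct_b ^ recovered_pad == id_b
```

Sequential IDs therefore stay visibly sequential in their low bits, which is exactly the leak a small block cipher is meant to prevent.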

adrian_b an hour ago | parent | next [-]

If you really want to encrypt and decrypt 32-bit numbers without having any nonces available, the fastest way on non-microcontroller CPUs remains using the AES instructions.

You can exploit the fact that the core of AES consists of 32-bit invertible mixing functions. In order to extend AES to 128-bit, a byte permutation is used, which mixes the bytes of the 32-bit words.

The AES instructions are structured so that you can cancel the byte permutation. In that case, you can use them to encrypt four 32-bit words separately, instead of one 128-bit block.

Similarly by canceling the standard byte permutation and replacing it with separate permutations on the 2 halves, you can make the AES instructions independently encrypt two 64-bit words.

These AES modifications remain faster than any software cipher.

How to cancel the internal permutation and replace it with external shuffle instructions was already described in the Intel white paper published in 2010, at the launch of Westmere, the first CPU with AES instructions.
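
To illustrate the structure described above, here is a toy pure-Python model of an AES-style round with the byte permutation (ShiftRows) left out, so that a single 32-bit column is transformed independently. It is only a sketch: it is not the AES-NI instruction sequence from the Intel white paper, the round keys are hypothetical constants with no key schedule, every round includes MixColumns, and nothing here says anything about the security of the result.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def rotl8(x: int, n: int) -> int:
    return ((x << n) | (x >> (8 - n))) & 0xFF

def build_sbox() -> list:
    # S-box = affine transform of the multiplicative inverse (0 maps to 0).
    sbox = []
    for a in range(256):
        inv = 0 if a == 0 else next(b for b in range(1, 256) if gf_mul(a, b) == 1)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox

SBOX = build_sbox()
INV_SBOX = [0] * 256
for index, value in enumerate(SBOX):
    INV_SBOX[value] = index

def mix_column(col, row0):
    # Multiply one 4-byte column by the circulant matrix whose first row is row0.
    out = []
    for i in range(4):
        acc = 0
        for j in range(4):
            acc ^= gf_mul(row0[(j - i) % 4], col[j])
        out.append(acc)
    return out

def encrypt32(word: int, round_keys) -> int:
    col = list(word.to_bytes(4, "big"))
    for rk in round_keys:
        col = mix_column([SBOX[b] for b in col], (2, 3, 1, 1))    # SubBytes, MixColumns
        col = [b ^ k for b, k in zip(col, rk.to_bytes(4, "big"))]  # AddRoundKey
    return int.from_bytes(bytes(col), "big")

def decrypt32(word: int, round_keys) -> int:
    col = list(word.to_bytes(4, "big"))
    for rk in reversed(round_keys):
        col = [b ^ k for b, k in zip(col, rk.to_bytes(4, "big"))]
        col = [INV_SBOX[b] for b in mix_column(col, (14, 11, 13, 9))]
    return int.from_bytes(bytes(col), "big")

# Hypothetical round keys, for illustration only.
ROUND_KEYS = [(0x9E3779B9 * (i + 1)) & 0xFFFFFFFF for i in range(10)]
assert decrypt32(encrypt32(0x00000001, ROUND_KEYS), ROUND_KEYS) == 0x00000001
```

Because each step is invertible on its own, the whole thing is a permutation of the 32-bit values, which is the property needed for encrypting IDs without expansion.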

201984 14 minutes ago | parent [-]

Are you certain using AES is still faster? Let's say for a 32-bit block size and 64-bit key.

From https://en.wikipedia.org/wiki/Speck_(cipher), that Speck combination would use 22 rounds, and using the instruction timings for Zen 5 from https://instlatx64.github.io/InstLatx64/AuthenticAMD/Authent..., it looks like each round would take at most 3 cycles. (Dependency chain for each round is 3 instructions long, ror+add+xor). 22*3 = ~66 cycles.

Using AES with a pshufb to take out the ShiftRows step would cost 2 cycles for the pshufb plus 4 cycles for the aesenc in each round, so at 10 rounds you have ~60 cycles.

It's quite close, and to say which one wins, we'd need to actually benchmark it. One is not clearly much faster than the other.
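
For reference, a sketch of the Speck round structure being counted above (ror, add, xor per round, 22 rounds for a 32-bit block with a 64-bit key). It follows my understanding of the published design, with rotation amounts 7 and 2 for 16-bit words; the key word order `(l2, l1, l0, k0)` and the example key are assumptions, so check it against the official test vectors before relying on it. Python obviously says nothing about cycle counts.

```python
MASK = 0xFFFF   # Speck32/64 works on two 16-bit words
ROUNDS = 22

def ror(x: int, r: int) -> int:
    return ((x >> r) | (x << (16 - r))) & MASK

def rol(x: int, r: int) -> int:
    return ((x << r) | (x >> (16 - r))) & MASK

def round_fwd(x: int, y: int, k: int):
    # The dependency chain from the comment: rotate, add, xor.
    x = ((ror(x, 7) + y) & MASK) ^ k
    y = rol(y, 2) ^ x
    return x, y

def round_inv(x: int, y: int, k: int):
    y = ror(y ^ x, 2)
    x = rol(((x ^ k) - y) & MASK, 7)
    return x, y

def expand_key(key_words):
    # key_words = (l2, l1, l0, k0), four 16-bit words (assumed ordering).
    l = [key_words[2], key_words[1], key_words[0]]
    k = key_words[3]
    round_keys = []
    for i in range(ROUNDS):
        round_keys.append(k)
        # The key schedule reuses the round function, keyed by the round index.
        new_l, k = round_fwd(l[i], k, i)
        l.append(new_l)
    return round_keys

def encrypt(block: int, round_keys) -> int:
    x, y = block >> 16, block & MASK
    for k in round_keys:
        x, y = round_fwd(x, y, k)
    return (x << 16) | y

def decrypt(block: int, round_keys) -> int:
    x, y = block >> 16, block & MASK
    for k in reversed(round_keys):
        x, y = round_inv(x, y, k)
    return (x << 16) | y

rks = expand_key((0x1918, 0x1110, 0x0908, 0x0100))  # example key words
assert decrypt(encrypt(0x6574694C, rks), rks) == 0x6574694C
```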

botusaurus 2 minutes ago | parent [-]

maybe the reason they are so close is that the AES microcode is implementing exactly those operations

fluoridation an hour ago | parent | prev [-]

Would it, though? Either way you're operating in ECB mode with 2^32 or 2^64 values. Why is one more secure than the other?

EDIT: What I mean is you can do cypher = truncate(plain ^ AES(zero_extend(plain))).

201984 8 minutes ago | parent [-]

>EDIT: What I mean is you can do cypher = truncate(plain ^ AES(zero_extend(plain))).

How would you decrypt that though? You truncated 3/4ths of the AES output needed to decrypt it.
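
A quick way to see the problem, sketched on a 16-bit toy domain so that every input can be enumerated. The key is hypothetical and keyed BLAKE2b stands in for AES purely so this runs with the standard library; the point is only that truncating the output leaves a function that is not a bijection.

```python
import hashlib

KEY = b"hypothetical key"

def prf128(block: bytes) -> int:
    # Stand-in for AES-128 on one 16-byte block (keyed BLAKE2b, NOT AES).
    return int.from_bytes(hashlib.blake2b(block, key=KEY, digest_size=16).digest(), "big")

def truncated_xor(plain: int) -> int:
    # cypher = truncate(plain ^ F(zero_extend(plain))), with a 16-bit plaintext.
    extended = plain.to_bytes(16, "big")
    return (plain ^ prf128(extended)) & 0xFFFF

outputs = [truncated_xor(p) for p in range(1 << 16)]
distinct = len(set(outputs))

# Fewer distinct outputs than inputs means some ciphertexts are shared by
# several plaintexts, so no decryption function can exist.
assert distinct < (1 << 16)
print(distinct, "distinct outputs for", 1 << 16, "inputs")
```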

I thought you were suggesting this:

  ciphertext = truncate(AES(key) ^ plaintext)

And in this case, since AES(key) does not depend on the plaintext, it would just be XOR by a constant.

Joker_vD 3 hours ago | parent | prev [-]

Some people just itch to use something custom and then to have to think about it. Which can bring amazing results, sure, but it can also bring spectacular disasters, especially when we're talking about crypto.

giantrobot 3 hours ago | parent [-]

The article is less about crypto and more about improving UUID (and IDs in general) with small block ciphers. It's a low impact mechanism to avoid leaking data that UUID by design does leak. It also doesn't require a good source of entropy.

adrian_b 2 hours ago | parent | prev [-]

The block size of a block cipher function like AES is important for its security level, but it is completely independent of the size of the data that you may want to encrypt.

Moreover, cryptography has many applications, but the most important 3 of them are data encryption, data integrity verification and random number generation.

The optimal use of a cryptographic component, like a block cipher, depends on the intended application.

If you want e.g. 32-bit random numbers, the fastest method on either Intel/AMD x86-64 CPUs or Arm Aarch64 CPUs is to use the 128-bit AES to encrypt a counter value and then truncate its output to 32 bits. The counter that is the input to AES may be initialized with an arbitrary value, e.g. 0 or the current time, and then you may increment only a 32-bit part of it, if you so desire. Similarly, for other sizes of random numbers below 128 bits, you just truncate the output to the desired size. You can also produce random numbers drawn from a range whose size is not a power of two, by combining either multiplication or division of the output value with rejection done either before or after the operation (to remove the bias).
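
A rough sketch of that counter-and-truncate scheme, including the division-with-rejection variant for ranges that are not a power of two. The class name, key and starting counter are made up for illustration, and keyed BLAKE2b stands in for AES-128 only so the example needs nothing beyond the standard library.

```python
import hashlib

class CounterRng:
    """Encrypt a counter, truncate the output (illustrative stand-in, not AES)."""

    def __init__(self, key: bytes, start: int = 0):
        self.key = key
        self.counter = start  # may start at 0, the current time, etc.

    def _block(self) -> bytes:
        # Stand-in for AES-128 applied to a 128-bit counter block.
        data = self.counter.to_bytes(16, "big")
        self.counter = (self.counter + 1) % (1 << 128)
        return hashlib.blake2b(data, key=self.key, digest_size=16).digest()

    def rand32(self) -> int:
        # Truncate the 128-bit output to 32 bits.
        return int.from_bytes(self._block()[:4], "big")

    def below(self, n: int) -> int:
        # Uniform value in [0, n): reject the top partial bucket, then reduce.
        if not 0 < n <= (1 << 32):
            raise ValueError("n must be in 1..2**32")
        limit = (1 << 32) - ((1 << 32) % n)
        while True:
            r = self.rand32()
            if r < limit:
                return r % n

rng = CounterRng(b"hypothetical key")
samples = [rng.below(10) for _ in range(5)]
assert all(0 <= s < 10 for s in samples)
```

Rejecting before the modulo is what removes the bias; a bare `rand32() % n` would slightly favour the small residues whenever `n` does not divide 2^32.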

Similarly, for message authentication, if you have some method that produces a 128-bit MAC, it can be truncated to whatever length you believe to be a good compromise between forgery resistance and message length.

For encryption, short data must be encrypted using either the CTR mode of operation or the OCB mode of operation (where only the last incomplete data block is encrypted using the CTR mode). With these modes of operation, the encrypted data can have any length, even a length that is not an integer number of bytes, without any length expansion of the encrypted message.

The advice given in the parent article is not bad, but it makes sense only in 32-bit microcontrollers, because since 2010 for x86-64 and since 2012 for Aarch64 any decent CPU has AES instructions that are much faster than the implementation in software of any other kind of block cipher.

Moreover, for random number generation or for data integrity verification or for authentication, there are alternative methods that do not use a block cipher but use a wider invertible function, and which may be more efficient, especially in microcontrollers. For instance, for generating 128-bit unpredictable random numbers, you can use a counter with either a 128-bit block cipher function together with a secret key, or with a 256-bit invertible mixing function, where its 128-bit output value is obtained either by truncation or by summing the 2 halves. In the first case the unpredictability is caused by the secret key, while in the second case the unpredictability is caused by the secret state of the counter, which cannot be recovered by observing the reduced-size output.

For applications where a high level of security is not necessary, e.g. for generating 32-bit random numbers, the already high speed of AES-128 (less than 0.5 clock cycles per output byte on recent CPUs) can be increased by reducing the number of AES rounds, e.g. from 10 to 4, with a proportional increase in throughput.

avidiax 4 hours ago | parent | prev [-]

If you want to encrypt a serial number, you don't want the output to be 256 bits.

adrian_b 2 hours ago | parent | next [-]

The size of encrypted data is completely independent of the block size of a block cipher function that is used for data encryption.

Nowadays, there is almost never any reason to encrypt with a mode of operation other than CTR or OCB, neither of which expands the size of the encrypted data.

That said, the parent article was less about encryption and more about random number generation, which is done by encrypting a counter value, but you never need to decrypt it again.

In RNGs, the block size again does not matter, as the output can be truncated to any desired size.

SAI_Peregrinus 2 hours ago | parent | prev [-]

AES is most often used in a streaming mode, where it's used to generate a keystream. AES alone is useless; it MUST have a mode of operation to provide any security. A streaming mode can then encrypt any number of bits greater than 0. AES-CTR is one of the more common streaming modes.
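
A small sketch of the keystream idea, assuming an 8-byte nonce and a block counter. Keyed BLAKE2b stands in for the AES block function (it is not AES, just something in the standard library), and the function names, key and nonce are hypothetical. Note that this relies on a fresh nonce for every message.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, nbytes: int) -> bytes:
    # Each keystream block is F_key(nonce || counter); F stands in for AES.
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        block = nonce + counter.to_bytes(8, "big")
        out += hashlib.blake2b(block, key=key, digest_size=16).digest()
        counter += 1
    return bytes(out[:nbytes])

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same operation: XOR with the keystream.
    ks = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

key = b"hypothetical key"
nonce = (1).to_bytes(8, "big")
message = b"SN-00042"

ciphertext = ctr_xor(key, nonce, message)
assert len(ciphertext) == len(message)              # no length expansion
assert ctr_xor(key, nonce, ciphertext) == message   # same call decrypts
```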