The Mystery of 248 and Cofactor DH
The Key generation of Ed25519 signature algorithm
The Mystery of 248 and Cofactor DH
Introduction
I was reading the OpenSSH portable codebase trying to dig out anything interesting and saw very strange lines of code during the key generation of Ed25519 signature algorithm. Curious as always started digging whether or not it is legitimate.
Ed25519
Ed25519 is the Elliptic Curve Signature Algorithm designed by cryptographer djb and his papers on it can be found here. Let’s recall the parameters of the scheme provided in Wikipedia.
is the twisted Edwards curve defined by: and is the unique point whose coordinate is and whose coordinate is positive.- “positive” coordinates are even coordinates (least significant bit is cleared)
- “negative” coordinates are odd coordinates (least significant bit is set)
First of all we know that we are on a prime field so q is prime. Finding such primes is still an interesting problem in Cryptography. Sage verifies our claim:
sage: is_prime(2^(255)-19)
True
The curve defined by the equation above is a twisted Edwards curve. Edwards curves are defined by the equation
where it is untwisted when
sage: q = 2^(255)-19
sage: k = GF(q);
sage: k(121665/121666)
20800338683988658368647408995589388737092878452977063003340006470870624536394
Kinda long, isn’t it? How many digits are there?
sage: len(str(k(121665/121666)))
77
sage: log(20800338683988658368647408995589388737092878452977063003340006470870624536394,2).n()
253.523142230851
So it is 254 bits. Another note in Wikipedia tell us the following:
The curve
If you worry, oh there is a minus sign in sqrt, recall that we are working in a prime field and the following holds for every prime field:
What does birationally equivalent even mean? First let’s define what
rational equivalence means. It means that there exist rational
functions
Of course
Fortunately this is not an article on Sheaves and Schemes. Now, just tell me what the heck is a cofactor, will you?
Number of rationals point on Curve25519
Although curves defined over
where
where
Ed25519 Key Generation in OpenSSH-Portable
Let’s look into the source code. It starts in ssh-keygen.c on line 3799:
break;
default:
if ((r = sshkey_generate(type, bits, &private)) != 0)
fatal("sshkey_generate failed");
break;
}
Here it calls int sshkey_generate(int type, u_int bits, struct sshkey **keyp)
in
sshkey.c. Relevant
lines start at 1753:
switch (type) {
case KEY_ED25519:
if ((k->ed25519_pk = malloc(ED25519_PK_SZ)) == NULL ||
(k->ed25519_sk = malloc(ED25519_SK_SZ)) == NULL) {
ret = SSH_ERR_ALLOC_FAIL;
break;
}
crypto_sign_ed25519_keypair(k->ed25519_pk, k->ed25519_sk);
ret = 0;
break;
So basically here, it allocates enough space and passes pointers for
public and private keys to int crypto_sign_ed25519_keypair( unsigned char *pk, unsigned char *sk )
in
ed25519.c. This
is exactly where the fun begins. Let’s look into
crypto_sign_ed25519_keypair
.
int crypto_sign_ed25519_keypair(
unsigned char *pk,
unsigned char *sk
)
{
sc25519 scsk;
ge25519 gepk;
unsigned char extsk[64];
int i;
randombytes(sk, 32);
crypto_hash_sha512(extsk, sk, 32);
extsk[0] &= 248;
extsk[31] &= 127;
extsk[31] |= 64;
sc25519_from32bytes(&scsk,extsk);
ge25519_scalarmult_base(&gepk, &scsk);
ge25519_pack(pk, &gepk);
for(i=0;i<32;i++)
sk[32 + i] = pk[i];
return 0;
}
Clamping ext[31]
is basically due to the magnitude
Now, do you see the line where the least significant byte of the
secret key extsk[0]
is anded with the number 248? In binary, it is
0b11111000
so it zeroes out the least significant three bits of the
secret key. I was puzzled by this and look for the reasons behind it.
Elliptic Diffie-Hellman KEX
Before delving into the details, let’s review how classic
Diffie-Hellman Key Exchange works. Say there are two parties who want
to generate keys for a symmetric algorithm. First party, say Alice,
generates a random
and sends it to Bob. Similarly, Bob generates a random
Sends this information to Alice. Both then do the final symmetric operation to generate a shared secret:
The security of this scheme depends on a NP-hard problem called
Discrete Logarithm Problem. The problem depends on existence of a
polynomial time algorithm to find
Anyway, no time for Complexity Theory. Let’s look how elliptic
version of DH works. First Alice generate a random scalar
Sends it to Bob. Similarly, Bob picks a random
Sends it to Alice. Both calculate the shared secret similar to the classic version:
This security of this problem is based on the Elliptic Discrete Logarithm problem which is basically the Elliptic Curve version of DLP on prime fields.
The Cofactor Problem
So let’s look into the details of this problem. Basically, the problem states that the least significant 3 bits can be leaked during an Elliptic DH by specially crafting values. I promise you will have a better understanding by the end of this section.
Say the elliptic group
- Calculates:
and sends it to Bob, - Bob picks a random
and calculates:
So there are only 8 possible keys for this KEX depending on the value
of
by comparing it to
To prevent this kind of leak, Bob would basically check whether the
order of Eve’s public key is greater than the cofactor. This can be
accomplished either comparing the values to the ones on the list or by
calculating
Conclusion
In openssh-portable’s implementation, the least significant bits of the secret keys are zeroed out to prevent the leak. You might ask why pre-define these bits? My guess is basically, these bits might leak information about the random number generator of the target, so fixing them to 000 would prevent such an information to be exfiltrated by an adversary. However, this reduces the keyspace by 8 folds.
One note is that, the keys generated in openssh-portable are for signing not DH. So, it might raise the question why fix for DH in a signature algorithm? I’ll let you know when I have the answer.