I spent years using both SSH and GnuPG without paying much attention to how fingerprints are built. Lately I started to migrate from ssh-agent to gpg-agent and a new player entered the game: the keygrip. That was enough; I needed to know exactly what was behind these three identifiers.
Why have key identifiers?
With GnuPG, a key can be up to 8192 bits long. That is really long! Printing such a key on screen takes a lot of space, and it is not a convenient way to tell one key from another. A shorter identifier was needed: this is where fingerprints come in.
When a key pair is used through GnuPG, up to three different kinds of identifier can show up: the GnuPG fingerprint, the SSH fingerprint, and the keygrip. Without knowing the internal mechanisms of GnuPG or SSH, it is not always obvious why a specific identifier is chosen, or how it relates to a key pair.
Basically, each of these identifiers works the same way by hashing information coming from the public part of a key pair.
GnuPG Fingerprint (or key id)
The GnuPG fingerprint is the identifier used to identify a key pair through the GnuPG protocol. It can identify a primary key or a subkey, although nothing in the fingerprint itself indicates which of the two it refers to.
To better understand how the GnuPG fingerprint is built, a small dive into GnuPG internals is needed. In memory or on disk, GnuPG stores data in what is called a packet. A packet is a data structure used to simplify access to information. Several types of packet exist, depending on the information stored: comment packets, signature packets, public key packets, plain text packets, etc. (see file packet.h)
The interesting packet here is the one storing a public key. This packet format has existed for so long that it has evolved several times. Its current version is 4, but version 5 is coming.
The packet version used for an identity stored in your GnuPG keyring can be checked with the --list-packets argument:
user@host:~$ gpg --export firstname.lastname@example.org | gpg --list-packets
# off=0 ctb=99 tag=6 hlen=3 plen=525
:public key packet:
    version 4, algo 1, created 1522017628, expires 0
    pkey: [4096 bits]
    pkey: [17 bits]
    keyid: DA7931AD47EC6BB4
...
In this example, version 4 is used.
What hash algorithm is used? With public key packet version 4, the SHA1 algorithm is used to build the fingerprint, while with public key packet version 5, the SHA256 algorithm is used. (see function compute_fingerprint in file keyid.c)
What information is hashed? The information used is the public key material (e.g. for RSA, modulus n and exponent e), the key creation date, the key algorithm, and the public key packet version. (see function hash_public_key in file keyid.c) Everything other than the public key material comes from the GnuPG protocol.
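To make the list of hashed fields concrete, here is a minimal Python sketch of the version 4 fingerprint for an RSA key, following the layout given in RFC 4880 (section 12.2). The function names `mpi` and `v4_fingerprint_rsa` are mine, not GnuPG's, and the sketch only covers RSA; treat it as an illustration rather than a copy of what keyid.c does.

```python
import hashlib
import struct


def mpi(value: int) -> bytes:
    """Encode an integer as an OpenPGP MPI: 2-byte bit count, then big-endian bytes."""
    bits = value.bit_length()
    return struct.pack(">H", bits) + value.to_bytes((bits + 7) // 8, "big")


def v4_fingerprint_rsa(n: int, e: int, created: int) -> str:
    """SHA-1 over a version 4 RSA public key packet (hypothetical helper)."""
    body = (
        bytes([4])                    # public key packet version
        + struct.pack(">I", created)  # key creation date (Unix time)
        + bytes([1])                  # key algorithm, 1 = RSA
        + mpi(n)                      # modulus
        + mpi(e)                      # public exponent
    )
    # The octet 0x99 and a 2-byte body length are prepended before hashing.
    data = b"\x99" + struct.pack(">H", len(body)) + body
    return hashlib.sha1(data).hexdigest().upper()
```

For a 4096-bit modulus and a 17-bit exponent, the packet body built here is 1 + 4 + 1 + 514 + 5 = 525 bytes, which is consistent with the plen=525 reported by --list-packets. Changing only the creation date yields a different fingerprint for the same key material.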
What does the GnuPG fingerprint look like? It is a 40-character hexadecimal string. The 40 characters come from the use of the SHA1 algorithm: it produces a 160-bit output which, written in hexadecimal (4 bits per character), makes 160 / 4 = 40 characters. E.g.:
user@host:~$ gpg -k --with-fingerprint
/home/user/.gnupg/pubring.kbx
----------------------------
pub   rsa4096 2018-03-25 [C]
      C894 F357 FB0B 4328 1C2A 04E7 DA79 31AD 47EC 6BB4
...
If the SHA256 function were used, the fingerprint would be 64 characters long, because the SHA256 output is 256 bits.
In many situations where no ambiguity is possible, GnuPG only prints a short suffix of this fingerprint, sometimes just the last 8 characters of the hexadecimal string.
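The suffix relationship can be checked against the two listings above: the keyid printed by --list-packets is the last 16 hexadecimal characters of the fingerprint. A small sketch, valid for version 4 keys, with `key_ids` being a hypothetical helper name:

```python
def key_ids(fingerprint: str):
    """Return (long key ID, short key ID) taken from the end of a v4 fingerprint."""
    hex_fpr = fingerprint.replace(" ", "").upper()
    return hex_fpr[-16:], hex_fpr[-8:]


long_id, short_id = key_ids("C894 F357 FB0B 4328 1C2A 04E7 DA79 31AD 47EC 6BB4")
print(long_id)   # DA7931AD47EC6BB4
print(short_id)  # 47EC6BB4
```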
When is the GnuPG fingerprint used? This identifier is mostly used with the gpg tool to identify which key to use for encryption or signing operations. It is also used in the file names of revocation certificates.
SSH Fingerprint
The SSH fingerprint is the identifier used to identify a key pair through the SSH protocol. The process to build this fingerprint is simpler than for the GnuPG fingerprint, because the SSH protocol is much simpler than the GnuPG protocol and uses key pairs only for authentication.
What hash algorithm is used? For this fingerprint, users can choose the algorithm. The available algorithms are MD5, SHA1, SHA256, SHA384 and SHA512. At the time of writing, the default algorithm is SHA256. Not so long ago, it was MD5. (see file digest-openssl.c)
What information is hashed? The information used is the public key (e.g. for RSA, modulus n and exponent e) and the type name (e.g. the string "ssh-rsa" for RSA). (see function to_blob_buf in file sshkey.c)
As with the GnuPG protocol, everything other than the public key material comes from the SSH protocol.
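As an illustration of what gets hashed, the following Python sketch rebuilds the RSA public key blob (type name, then e, then n, in the SSH wire encoding of RFC 4251 and RFC 4253) and hashes it. All function names are hypothetical helpers, not OpenSSH functions, and only positive integers and RSA keys are handled.

```python
import base64
import hashlib
import struct


def ssh_string(data: bytes) -> bytes:
    """SSH wire 'string': 4-byte big-endian length, then the bytes."""
    return struct.pack(">I", len(data)) + data


def ssh_mpint(value: int) -> bytes:
    """SSH wire 'mpint' for a positive integer (leading zero byte if the top bit is set)."""
    return ssh_string(value.to_bytes(value.bit_length() // 8 + 1, "big"))


def rsa_public_blob(n: int, e: int) -> bytes:
    """Key blob for RSA: type name, public exponent, modulus."""
    return ssh_string(b"ssh-rsa") + ssh_mpint(e) + ssh_mpint(n)


def ssh_fingerprint(blob: bytes, algorithm: str = "sha256") -> str:
    """Hash the key blob and format the result the way ssh-keygen -l prints it."""
    digest = hashlib.new(algorithm, blob).digest()
    if algorithm == "md5":
        return "MD5:" + ":".join(f"{byte:02x}" for byte in digest)
    return algorithm.upper() + ":" + base64.b64encode(digest).decode().rstrip("=")


def fingerprint_of_public_key_line(line: str, algorithm: str = "sha256") -> str:
    """Fingerprint the content of a .pub file: 'ssh-rsa AAAA... comment'."""
    blob = base64.b64decode(line.split()[1])
    return ssh_fingerprint(blob, algorithm)
```

The second field of a .pub file is simply the base64 encoding of this blob, so `fingerprint_of_public_key_line` does not need to know the key algorithm at all.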
What does the SSH fingerprint look like? It can take several forms depending on the hash algorithm used, with one constant: a fingerprint always starts with the hash algorithm name. Here are examples for every available hash algorithm:
user@host:~$ ssh-keygen -E md5 -lf ./testkey.pub
MD5:79:1e:34:92:2f:87:7b:63:92:85:e2:c1:38:ec:54:20
user@host:~$ ssh-keygen -E sha1 -lf ./testkey.pub
SHA1:1WAf5/bAKTfSy09StuBVlAKLqUs
user@host:~$ ssh-keygen -E sha256 -lf ./testkey.pub
SHA256:gv3nQD5aF6KERq3HfaCrsn7vMRujPU1HFbVqBArnKwY
user@host:~$ ssh-keygen -E sha384 -lf ./testkey.pub
SHA384:iL2AFWyOSum7uHBsA0b+mg5S3e0roJm+kCKMRjZFTn13ni7S1uAMyR7SOqObhuab
user@host:~$ ssh-keygen -E sha512 -lf ./testkey.pub
SHA512:6kAeUdhcQO5Pnx43xtXIjwJ9HFFQoVXm1AkZ40BKOioaAEq3841JEPNSvWuWfl9BwJYcGrHWmnmauGQP8qPtWw
While hexadecimal is still used to represent an MD5 hash, base64 is used to represent SHA hashes. A single hexadecimal character represents 4 bits, while a single base64 character represents 6 bits, so base64 is more compact for long hashes.
With an MD5 hash, a colon is inserted after every two characters. The 32 hexadecimal characters plus 15 colons make the total length of an MD5 fingerprint 47 characters. The colons are purely cosmetic.
| Algorithm name | Output size (bits) | Fingerprint length (characters) |
|---|---|---|
| MD5 | 128 | 128 / 4 = 32 (hexadecimal) |
| SHA1 | 160 | round_up(160 / 6) = 27 (base64) |
| SHA256 | 256 | round_up(256 / 6) = 43 (base64) |
| SHA384 | 384 | round_up(384 / 6) = 64 (base64) |
| SHA512 | 512 | round_up(512 / 6) = 86 (base64) |
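The lengths in this table can be double-checked with a few lines of Python. This is only a sanity check using hashlib on arbitrary input, not what OpenSSH runs; `fingerprint_length` is a hypothetical helper, and the base64 padding character "=" is stripped because SSH fingerprints do not show it.

```python
import base64
import hashlib


def fingerprint_length(algorithm: str) -> int:
    """Number of characters needed to print a digest, without prefix or colons."""
    digest = hashlib.new(algorithm, b"any key blob").digest()
    if algorithm == "md5":
        return len(digest.hex())                          # 4 bits per character
    return len(base64.b64encode(digest).rstrip(b"="))     # 6 bits per character


for name in ("md5", "sha1", "sha256", "sha384", "sha512"):
    print(name, fingerprint_length(name))
```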
When is the SSH fingerprint used? This identifier tells which identity is used during authentication. It may be printed while connecting to a new SSH server, or in any SSH log when publickey authentication is used. It may also be printed when adding an identity to an agent or removing one from it.
Keygrip
To introduce the keygrip and its purpose, here are two nice quotations from the GnuPG documentation:
"The keygrip is a unique identifier for a key pair, it is independent of any protocol, so that the same key can be used with different protocols. PKCS-15 calls this a subjectKeyHash; it can be calculated using Libgcrypt’s gcry_pk_get_keygrip()"
Source: https://github.com/gpg/gnupg/blob/master/agent/keyformat.txt
"To identify a key we use a thing called keygrip which is the SHA-1 hash of an canonical encoded S-Expression of the public key as used in Libgcrypt. For the purpose of this interface the keygrip is given as a hex string. The advantage of using this and not the hash of a certificate is that it will be possible to use the same keypair for different protocols, thereby saving space on the token used to keep the secret keys."
Source: https://gnupg.org/documentation/manuals/gnupg.pdf
So a keygrip is protocol agnostic: no information coming from GnuPG (e.g. the packet version) or from SSH (e.g. the type name) is used to build it. Only information coming from the key algorithm is used.
What hash algorithm is used? The keygrip identifier is built with the SHA1 algorithm. (see function _gcry_pk_get_keygrip in file pubkey.c)
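What information is hashed? Only information coming from the public key is used in the hash process. GnuPG first formats the public key material into an S-expression (e.g. (public-key(rsa(n...)(e...))) for RSA) and passes it to Libgcrypt, which decides per algorithm which parameters enter the hash; for RSA, as far as I can tell from Libgcrypt's code, only the modulus n is hashed. (see function keygrip_from_pk in file keyid.c)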
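For RSA the computation can be sketched in a few lines of Python. This rests on my reading of Libgcrypt's RSA code and should be taken as an assumption: only the modulus n appears to be fed to SHA-1, even though the S-expression also carries e, and n is encoded as it appears in the S-expression, i.e. big-endian with a leading zero byte when its top bit is set. `rsa_keygrip` is a hypothetical name; other algorithms (DSA, ECC, ...) hash their parameters differently and are not covered.

```python
import hashlib


def rsa_keygrip(n: int) -> str:
    """Keygrip of an RSA key under the assumption stated above: SHA-1 of the modulus."""
    # bit_length() // 8 + 1 bytes leaves room for the leading zero byte
    # exactly when the most significant bit of n would otherwise be set.
    raw = n.to_bytes(n.bit_length() // 8 + 1, "big")
    return hashlib.sha1(raw).hexdigest().upper()
```

Neither the creation date nor any protocol-specific field appears here, which is why the same RSA key gets the same keygrip whether it is used through GnuPG or through SSH.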
What does the keygrip look like? Like the GnuPG fingerprint (with public key packet version 4), it is a 40-character hexadecimal string, because it is built with the SHA1 hash algorithm. E.g.:
user@host:~$ gpg -k --with-keygrip
/home/user/.gnupg/pubring.kbx
----------------------------
pub   rsa4096 2018-03-25 [C]
      C894F357FB0B43281C2A04E7DA7931AD47EC6BB4
      Keygrip = 69B7D1FBF6F48ACA54531CB771088109C081C081
...
When is the keygrip used? Because it works with both the GnuPG and SSH protocols, this identifier is mainly used by gpg-agent: it appears in the path of the private key files, and it is used to enable an SSH identity inside the sshcontrol file.
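As a small illustration of both uses, the sketch below builds the private key file path and the sshcontrol entry for a given keygrip. It assumes the usual layout of a GnuPG 2.1+ home directory (a private-keys-v1.d directory holding one <keygrip>.key file per key, and an sshcontrol file with one keygrip per line); the helper names are hypothetical.

```python
from pathlib import Path


def private_key_path(keygrip: str, homedir: Path = Path.home() / ".gnupg") -> Path:
    """Where gpg-agent is expected to keep the private key for this keygrip."""
    return homedir / "private-keys-v1.d" / f"{keygrip}.key"


def sshcontrol_line(keygrip: str) -> str:
    """Line to append to sshcontrol to enable the key as an SSH identity."""
    return keygrip + "\n"


print(private_key_path("69B7D1FBF6F48ACA54531CB771088109C081C081", Path("/home/user/.gnupg")))
```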
Conclusion
The SSH fingerprint comes in several formats, but thanks to the hash algorithm name used as a prefix, it is easy to tell it apart from a GnuPG fingerprint or a keygrip.
Right now, the GnuPG fingerprint and the keygrip share the same format, a 40-character hexadecimal string, which can sometimes lead to a bit of confusion. With public key packet version 5, the issue should be solved, since the GnuPG fingerprint will then be 64 characters long.
While these three identifiers all use elements coming from the public part of a key, the way information is fed to the hash function, the extra information added by each protocol (GnuPG or SSH), and the choice of hash function produce totally different outputs, even for the same key pair.
The SHA256 hash algorithm has clearly been the standard for a while now, replacing both MD5 and SHA1, so it may seem strange that SHA1 is still used to build keygrips.