Digital Signatures and Electronic Document Authorization
In the past, most transactions occurred vis-a-vis, and the authority of the persons involved in the transaction was evident, the participants in the transaction recognized one another. In the case where one or other of the participants could not be present to conduct a transaction, the proceedings could be carried out by agents, or persons authorized and reecognized by both parties to the transaction as being the legitimate representatives of those parties. In some cases, these agents were authorized by and were recognized by vitue of instruments, documents, the possession of which conferred authority and provided a means of recognition, the authenticity of which was guaranteed by some token that was peculiar to the authorizing party, i.e. a seal or signature. The extension of this principle to telephonic communication is the recognition of the voice of the participants or the use of some unique, identifying item of knowledge (a credit card number, for instance). The presence of physical barriers (sealed envelopes, secure carriers) and the affixing of an unique authorization (a signature) provided assurance to the recipients that the document's source and contents were authentic.
Analog communications rely upon physical objects transmitted through physical avenues. Digital communications must be protected in a different manner, since both the original and the received copy are equally virtual. That is, the original of document may exist only in virtual form, as digital data on a computer, and the received copy of that original data also would be in digital form. In fact, neither original nor copy would be traceable to an analog reality. The authenticity of any digtial document, therefore cannot be guaranteed, unless the means of creating, the method of transmitting, and the place of receiving can all be secured against tampering and invasion. Since this is typically impossible, given that most transactions occur through public channels, accessible to anyone, the authenticity of a communication cannot be accepted without some external frm of confirmation.
The affixing of a digitized copy of a personal signature, the use of official letterhead, or of other digitized matter cannot provide the necessary assurance of authenticity. A digitized signature can be removed from a legitimate document and placed illegally in a spurious document, and the forgery would be impossible to detect. Official letterhead can be similarly reproduced. The only acceptable digital signature for a digital documant would be one that is incapable of forging and which is intrinsically a function of the document itself, that is, which could only have been affixed to the one document and which cannot be repudiated.
A digital, electronic signature must provide confirmation of authorization, of integrity, and a means by which a document cannot be repudiated.
One form of verification of integrity is possible with DES, the Message Authentication Code (MAC). Since the sender and the recipient share a single secret key, the key can be used to generate a checksum (the MAC) from the message which can be transmitted with the cipher text. Using his own copy of the key, the recipient can then calculate the checksum on the received message and compare it to the MAC. If the two are identical, then the message can be assumed to be intact. And, since the sender and recipient are the two persons assumed to have possession of the secret key, the recipient can, with reasonable assurance, assume that the message came from the sender. The secrecy of the key is the sole guarantee of the authenticity of the message and the MAC. The fact that the key must be kept secret forecloses the possibility of proving the authenticity and integrity of the message to a third party, however, because either the sender or the reicpient could have been the source of the message and the MAC. Without knowledge of the key, the third party must accept on authority that the message has been received intact.
Public key cryptography, in particular, RSA, can provide a means by which the integrity and authenticity of a document can be verified to the satisfaction of any recipient or third party.
First, having encrypted the message using the recipient's public key, the sender computes from the message a "message digest" or "hash", a brief and unique digital fingerprint that is derived from the message by a "one-way message digest function", also known as a "one-way secure hash function". The message digest is a string of letters and numerals that is peculiar to the individual message, but from which nothing about the message can be determined, thus the one-way description. It is similar in nature to the DES MAC, and furnishes an incorruptible test of the integrity of the message. Theortically, then, if a message is intercepted, deciphered, altered in even one bit, re-enciphered and delivered to its intended recipient, the altered message should not produce a digest that is the same as that that had originally been computed for the unaltered message.
Next, the sender "signs" the digest with his/her own private key, that is, computes an encrypted version of the hash using the private RSA key. This "signed" digest is then transmitted to the recipient with the encrypted message.
The recipient of the message uses the sender's public key to decrypt the message digest. The recipient next computes from the received message, using the publicly known hash function, a new message digest, which he compares with the received version. If they match, the recipient is assurred that the message is intact and that the message did, indeed, originate with the purported sender. Because the hash and the signature can both be confirmed using only public information (the hash function and the sender's public key), the authenticity and integrity of the message can be demonstrated to any third party, thereby conferring upon it an undeniable origination. Should any dispute arise subsequently, independent ajudication can prove both the content and the source of the message.
The Secure Hash Algortithm, as defined by Federal Information Processing Standard 180 (FIPS 180) of 11 May, 1993, specifies a 160 bit fixed-length digest for any message of less than 264 (1.844674407 x 1019) bits in length (a text containing approximately 2.305843009 x 1018 characters, sufficient verbosity, one should imagine, for even the most loquacious interlocutor among us.) The FIPS specifies that "it is computationally infeasible" to find a message that corresponds to the digest, and that no two messages shall produce the same digest. The digest is calculated from the text and from certain known constants, using iterative logical operations. In summary, the process of creating the digest is as follows:
The hash function begins with a message. The message data are rendered into their ASCII code equivalents, creating a single, unbroken bit stream that includes all alphanumeric characters, spaces, and punctuation. The bits are taken sequentially four bits at at a time and converted into hexadecimal values. These hexadecimal units are serially grouped together in sets of eight to make 32-bit "words" (Wt).
The message data "words" are combined into uniform blocks (M1,M2, M3,...Mn) of 512 bits. As necessary to make the message length a multiple of 512, the end of the message is padded to fill a last block of 512 bits. The deficient block is measured and the length (L) is made a temporary constant. The digit one (1) is added to the right hand end of the block, and zeroes (0) are appended after the one, such that the length of the block becomes L + 1 + zeroes + 64 = 512. The last 64 bits are reserved as an hexadecimal index of the original length (L) of the deficient block, which is then appended, bringing the total length of the block to 512 bits.
Thus, if the original block were 01100001 01100010 01100011 01100100 01100101 (abcde), 40 bits in length, the message would be padded:
01100001 01100010 01100011 01100100 01100101 1, followed by 407 zeroes, making the block 448 bits in length. The original length (L), is then represented as a 64 bit expression (00000000 00000000 00000000 00000000 00000000 00000000 00000000 00101000) and is added to the block, filling it out to 512 bits. In hexadecimal notation, this would equal 61626364 65800000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000028.
With the padding, the message can then be divided into the blocks of 512 bits each, every block containing sixteen words of 32-bits. Each block is operated upon successively.
Four constants are defined as keys (Kt), 5A827999, 6ED9EBA1, 8F1BBCDC, CA62C1D6. They are each used in succession, each for 20 operations, making a total of 80 operations per message block. Two sets of five registers are also defined, A, B, C, D, and E, and H0, H1, H2, H3, H4. The H registers are given these initial values:
H0 = 67452301 H1 = EFCDAB89 H2 = 98BADCFE H3 = 10325476 H4 = C3D2E1F0.
A TEMP buffer is also created with an initially NULL value.
A block is expanded from sixteen to eighty words, each of 32-bits, using a loop that performs exclusive OR operations with selected words. The sixteen original words become the first sixteen words of the expanded set and are unaltered. The expansion algortihm is contained in the expression:
W(t) = W(t-3) XOR W(t-8) XOR W(t-14) XOR W(t-16)
in whch t is a value from 16 to 79 that is incremented after each execution of the algorithm. In BASIC, the loop would have the form:
FOR t = 16 to 79 W(t) = W(t-3) XOR W(t-8) XOR W(t-14) XOR W(t-16) NEXT t
The registers A, B, C, D, E are then equated with their respective H register counterparts: A = H0, B = H1, C = H2, D = H3, E= H4.
Another loop is established in which bit shifts and additions are performed using words from the expanded eighty word set of the block and the four previously defined keys (Kt). Key one is used for the first 20 executions of the loop (0&endash;19), key two is used in the second twenty executions of the loop (20&endash;39), and so on through the full eighty executions. The algortihm for shifting and adding of words and keys has the form:
TEMP + S^5(A) + f(t;B,C,D) + E + W(t) + Kt
The temporary value (TEMP buffer contents , initially zero) is added to the shifted value of variable A. The expression S^5(A) represents a rotation of the bits of the variable five places by truncating the leftmost five bits and padding the value with an equal number of zeroes on the righthand side of the number. The expression f(t;B,C,D) represents a variable series of logical operations that are performed upon the contents of the three variable registers B, C, and D. The operations performed are determined by the value of t, which is the current counter (a number between 0 and 79 inclusive) of the iterations of the loop within which the computations are being conducted. During the first 20 passes through the loop (0 through 19), f(t;B,C,D) will perform these logical operations: (B AND C) OR ((NOT B) AND D). During the second 20 passes through the loop (20 through 39), f(t;B,C,D) will perform these logical operations: B XOR C XOR D. During the third 20 passes through the loop (40 through 59), f(t;B,C,D) will perform these logical operations: (B AND C) OR (B AND D) OR (C AND D). During the final 20 passes through the loop (60 through 79), the function f(t;B,C,D) will perform these logical operations: B XOR C XOR D.
The results of each iteration of the loop are contained in the variable registers which are then changed as follows:
E = D D = C C = S^30(B) (A rotation or shift of 30 bits in the manner explained above.) B = A A = TEMP
When the 80 loops have been completed, the results are placed in the five H variables, which are defined as follows: H0 = H0 + A, H1 = H1 + B, H2 = H2 + C, H3 = H3 + D, H4 = H4 + E.
The digest is concatenated from the values in H0, H1 , H2 , H3 , H4.
There are other possible methods of computing the same digest, but the methods discussed is amply illustrative of the process by which a message disgest can be computed. The standard as proposed was subsequently found to have a minor weakness that reduced the effective security of the hash algorithm very slightly, but the diminution of security was deemed so slight that all applications containing the algorithm were still certified to be secure with all currently available technology and to remain so until the algorithm shall be emended to remove the flaw.
There are several other hash functions available for use between agreeing parties, among them MD4 and MD5 (MD for Message Digest). These were created by Ronald Rivest, co-author of RSA. They are not part of the NIST standard, but they appear to be at least as secure and useful as the standard. Indeed, the majority of extant applications use one or another of Rivest's message digest algortihms.
The digest or hash establishes for the recipient, or any third party, that the content of the message is intact and has not been altered. In many instances, this is enough. However, it is often necessary to prove that a specific person is the author of the message and, further, that the author has been empowered to issue the message. A digital signature provides these assurances.
The Digital Signature
The Secure Hash Alogorithm creates the digest of a message. The digest can then be "signed" by the sender, encrypting it with the sender's private key. Since the private key is the inverse of the public key, the encrypted digest can be verified as having been created by the sender by decrypting it using the sender's public key. The author and the recipient both benefit from the verification of the authenticity of source and integrity of content of a message. The sender has the assurance that the message will arrive unaltered, or, if altered, will be rejected. The recipient has the assurance that the sender has the authority to generate the message and that the message is unchanged. Additionally, the sender cannot later repudiate the message, since only he could have signed it with his private key.
The algortihm for a digital signature is contained in these expressions:
r = (gk mod p) mod q s = (k-1(SHA(M) + xr)) mod q.
The values of r and s are the signature.
The variables p and q are familiar from the discussion of RSA, p being the prime modulus and q is a prime divisor of (p - 1). The variable x is the sender's private key. The value of k is an integer randomly generated for each message, and g is calculated from the expression g = h(p-1)/q mod p, in which h is an integer greater than 1 and less than (p - 1). The values of p, q, and g, as well as y, the sender's public key, are available to the recipient or to any third party. The values for r and s are also sent to the recipient.
In the above, k - 1 is the multiplicative inverse of k mod q; i.e., (k-1 k) mod q = 1 and 0 < k - 1 < q. The value of SHA(M) is a 160-bit string output by the Secure Hash Algorithm specified in FIPS 180.
For use in computing s, this string must be converted to an integer. If either r = 0 or s = 0, a new value of k should be generated and the signature should be recalculated (it is extremely unlikely that r = 0 or s = 0 if signatures are generated properly" (FIPS 186).
The recipient verifies a message using the published values for p, q, and g, and y. The recipient first checks to see that neither r nor s is equal to zero and that both are less than q. Any failure of these checks results in the rejection of the message. Assuming that the preliminary checks are satisfactory, the recipient then verifies the message by computation using the following expressions:
w = (s)-1 mod q u1 = ((SHA(M))w) mod q u2 = ((r)w) mod q v = (((g)u1 (y)u2) mod p) mod q.
If v equals r then the message is verified and can be accepted as genuine.
For the individual user, this method has the obvious uses in daily commerce, but it also makes possible the archiving of matter in a secure fashion. The user has confidence that the contents of an archive have neither deteriorated nor been corrupted if the archive's signed hash is consistent with a newly computed value. The signature has the additional utility that it can act as a virus detection tool for a binary file such as a program. A signature can be stored separately and then be used at a later time to verify the integrity of the program for which it was originally calculated. A change in the program's content (as when a virus has altered the program code) will be detected when verification is attempted.
©copyright 1998 Horace W. LaBadie, Jr.