v2.2.3 / chapter 11 of 13 / 01 jan 07 / greg goebel / public domain
* Cryptologic techniques such as block ciphers and public-key cryptosystems are now in common use on the Internet, and new ciphers are being introduced to ensure even greater security. This chapter discusses the current state of cryptologic technology.
* The Diffie-Hellman algorithm and public-key cryptography were big steps forward in providing security for the emerging Internet, but a few other pieces were required as well. One of the Internet's many security problems is the fact that forging an email message is very easy. For example, a professor at a university was the target of a malicious prank in which a message containing racist comments was broadcast using his address, causing the professor a great deal of trouble. With public-key cryptography, anyone can use Bob's public key to send him a message, and if the return email of the address of the message has been faked, Bob can be easily fooled into accepting a forgery.
This means that one of the important elements of a useful cryptosystem is "message authentication", or ensuring that Bob can receive a message from Alice and actually be certain it is from Alice. Related to the concept of message authentication are the other concepts of "data integrity", or ensuring that the data in the message has not been tampered with, or corrupted by a transmission error; and "non-repudiation", or ensuring that if Alice sends a contract to Bob, she can't later claim that it was a fabrication and disown it.
Diffie and Hellman devised an elegant little trick that allows public key cryptography to authenticate a message. This trick is based on the fact that RSA, and apparently other public key ciphers, has an interesting symmetry. Normally, in RSA, a message is encrypted with a public key and can only be decrypted with the corresponding private key. Anyone can encrypt a message with the public key, but only one person can decrypt it using the secret private key. The flow of the encryption is "many to one".
It turns out that the reverse is also true: a message can be encrypted with a private key, and can only be decrypted with the corresponding public key. This sort of "inside-out encryption", to give it a name, might seem silly, since then anybody can use the public key to read the message: the flow of the encryption is "one to many". Inside-out encryption provides no security, but the symmetry also holds in that the public key can only decrypt messages encrypted with the secret private key. This means that Alice can perform inside-out encryption using her private key on a message and send it to Bob. Bob then uses her public key to decrypt the message. Since Alice's public key will only decrypt a message that was encrypted with her unknown private key, as long as Bob is certain he has Alice's proper public key, he is just as certain that the message came from Alice.
Encrypting with a private key is more formally known as "signing", and decrypting with a public key is more formally known as "verifying". RSA can of course be used for this kind of authentication, but in the early 1990s NIST also approved a standard "Digital Signature Algorithm (DSA)". DSA can't be used, at least not very practically, for encryption as such, and instead of generating a readable plaintext, it simply provides a YES or NO on whether the message is valid or not.
* One thing to remember is that anyone can read an inside-out encrypted message. All they have to do is get Alice's public key. When this is a problem, the solution is simple: perform inside-out encryption on a message that's encrypted to begin with. Even if someone intercepts Alice's message and opens it up using her public key, all they'll get is a message they can't read. In detail:
The first decryption guarantees that Alice sent the message, the second gives the actual contents of the message.
However, this approach leads to another problem: performing two levels of public key encryption is very time-consuming. As noted, for this reason public key cryptography is generally only used to encipher a key for a block-ciphered message. Similarly, in practice, instead of using this inside-out encryption scheme to validate entire messages, Bob can create a short "fingerprint" of the message called a "hash", or sometimes a "message digest", characteristic of the message, and send the hash to Alice separately using inside-out encryption.
A "hash" is a relatively short string of bits, usually 160 bits long, that results from running the full message through a "hashing" algorithm. Such hashing algorithms are designed to produce a value that is unique for each different message run through the algorithm, or in other words the probability of two different messages resulting in the same hash must be extremely low. Furthermore, even a single minor change in the message must result in a distinctively different hash. Of course, hashes are one-way: there's no way to reconstruct the original message from the hash, any more than a sausage can be turned back into a pig.
There are a number of hash functions used in cryptology, with names like "SHA-1 (Secure Hash Algorithm 1)", "RIPEMD-160", and "MD5 (Message Digest 5)". Incidentally, MD5 was developed by Rivest, who must be something of a workaholic.
A hash can be sent by itself, but of course there's no way to validate who it came from if it is. To authenticate his message, Bob first makes a hash of it using one of these functions, performs inside-out encryption on the hash with his private key, and then sends the encrypted hash on to Alice. This encrypted hash is actually formally known as a "message authentication code (MAC)".
When Alice receives the MAC, she decodes it with Bob's public key. After separately decrypting Bob's full message, she runs the plaintext through the same hashing function used by Bob, and compares the hash generated with the hash she has received. If they match, Alice knows the message was sent by Bob and has not been tampered with.
* There is one more problem, a big one, with Alice using inside-out encryption to establish a digital signature for her messages to Bob. It implies that Bob really does have Alice's public key. Suppose Eve manages to insinuate herself as a secret "middleman" into correspondence between Bob and Alice, intercepting messages from both and then passing them on. Eve can feed her public key to both Bob and Alice, decrypt their messages, then re-encrypt them with Bob's or Alice's public keys as required and pass them on without being detected.
This means that there must be a way of guaranteeing that public keys belong who they are supposed to belong to, or in more formal terms to "validate" the public keys. Organizations known as "certification authorities (CA)" have been established to validate public keys. This service could be provided by a non-profit, commercial, or even governmental organization. The certification authority does not have private keys and so cannot read communications from those it certifies.
Certification is established using "digital certificates" or "certs", which consist of a public key, data on the owner of the public key, an expiration date on the certification, and one or more digital signatures associated with the certification authority. There are a number of different digital certificate specifications, such as the "X.509" standard format, devised by the International Telecommunications Union (ITU). Public key cryptography users can submit their certs to a public database known as a "certification server", allowing them to trade public keys with some degree of security, or a certification authority can provide a "public key infrastructure (PKI)" system, with additional levels of security to ensure that the public keys are valid.
* Block ciphers and RSA provided the tools needed for security on the Internet, but it wasn't until the early 1990s that they became a technology accessible to all, through the work of a crypto hack named Phil Zimmermann. In the late 1980s, Zimmermann came up with a cryptrographic software package named "Pretty Good Privacy (PGP)" that he wanted to make available to the masses on the Internet.
PGP incorporated the key exchange capabilities of RSA with the powerful encryption capabilities of modern block ciphers. As noted earlier, RSA is too computation-intensive to be useful for enciphering long messages and is also relatively weak, and so it is normally only used to send a block cipher key used to encipher the body of the message. Zimmermann decided to use RSA to encipher a randomly-generated one-time "session key", and encipher the body of the message using the session key with a block cipher, by default the IDEA cipher, though alternative ciphers could be used. PGP took care of all the details.
PGP also offered several other convenient features. For example, it could be used to generate the values for a public key and a private key. The user simply wiggled the mouse around, and RSA used the motions as a random seed to help generate large prime number values highly likely to be unique. Another feature provided by PGP was message authentication, using the Diffie-Hellman technique. PGP could use X.509 digital signatures, but it had its own format for digital signatures as well.
* Zimmermann was ready to go with PGP in 1991, but he was confronted with two legal obstacles. First, the RSA algorithm was patented and controlled by RSA Data Security Incorporated. Second, in 1991 the US Senate had passed an omnibus anti-crime bill, with a clause essentially banning encryption schemes that the government could not crack. This clause had been dropped after protests by civil liberties groups and businesses, including RSA Data Security, but nobody felt that the issue had gone away.
Zimmermann was an activist and looked forward to this sort of confrontation as an exercise in principles. In 1991 he had a friend post PGP to an Internet bulletin board site, and after a slow start it began to take off. As PGP caught on, Zimmermann began to attract controversy. He had been hoping to obtain a free license for RSA from RSA Data Security, but the company turned him down and began patent infringement actions against him. RSA Data Security officials branded PGP "Pretty Good Piracy". More significantly, in early 1993, the US government began to investigate him on charges of illegally exporting a "munition", which was how the government classified encryption software.
The legal quarrel went on for over three years. The government abandoned the case against Zimmermann in 1996, out of recognition that the cat was out of the bag and the case against Zimmermann was so controversial that putting him on trial would have been a fiasco, with little likelihood of obtaining a conviction. Zimmermann then came to an understanding with RSA Data Systems over the licensing of RSA. His legal troubles were over. He sold PGP to a company named Network Associates, which he also became affiliated with, but PGP still remains freeware for individual users on the Internet, at "www.pgpi.com".
* The pioneers of public-key cryptography like Whitfield Diffie have been proven right, even conservative, in the use of cryptographic technologies on the Internet. For one particularly important example, most people who surf the Internet to make online purchases have used cryptography to provide their credit-card numbers to vendors, using the "Secure Sockets Layer (SSL)" protocol. Pages protected by SSL are designated with an "https" prefix instead of the conventional "http" prefix.
SSL was invented by Netscape in 1994. It was adopted by the Internet Engineering Task Force's Transport Layer Security (TLS) committee as the basis for a standard in 1996, and the slightly mildly modified variant of SSL that was released by the committee is known by the committee's acronym, "TLS". However, the acronym "SSL" will be used in this document. Incidentally, Microsoft tried to push a competing security scheme designated "Private Communication Technology (PCT)", but Microsoft-bashers will be pleased to know that it didn't catch on. Microsoft ended up adopting SSL.
SSL works more or less transparently to the user, and uses both public-key and symmetric ciphers. Supposed Alice wants to make a purchase from an online vendor and needs to give the vendor her credit-card number. After establishing the vendor's identity using a digital certificate, her web browser will download the vendor's public key and use it to encrypt a secret symmetrical key produced automatically at (more or less) random by the Web browser. The Web browser will pass the encrypted secret key to the vendor, and this secret key will be used to encrypt the session. Incidentally, web browsers set up a table of certs from prominent business concerns when they are installed.
Actually, several secret keys are exchanged, but a detailed discussion of the complexities is a bit beyond the scope of this document. Incidentally, the SSL protocol doesn't try to validate who Alice really is, at least not by default, since obtaining valid digital certificates for typical online customers is not very practical at present. In some critical transactions, a digital certificate may be requested of the purchaser.
* Another invisible use of cryptography on the Internet is in "virtual private networks (VPNs)". As the name implies, a VPN is a private network that can only be accessed by a specified group of users, but still operates on the public Internet. Online security is maintained by IPSec technologies, giving it effective or "virtual" privacy.
VPNs are usually implemented using a scheme different from SSL, known as "Internet Protocol Security (IPSec)" protocols, though the general idea is the same: use public keys to transfer symmetric keys, and then conduct the session using a symmetric cipher. One of the major differences of IPSec from SSL is that SSL is designed to perform secure transactions between a web server and a web user or client, while IPSec performs "peer-to-peer" transactions between multiple, more or less equivalent and effectively similar, nodes on a network.
* The Microsoft Internet Explorer web browser can use the RC4 cipher, a stream cipher developed by Rivest, with 40-bit keys or 128-bit keys; DES with 56-bit keys; or Triple-DES with 168-bit keys. 40-bit keys can be cracked by a brute-force search using serious computing horsepower in a few hours. The amount of horsepower required is expensive enough to ensure that nobody would bother to do this to crack credit-card transactions -- they could be using the computing power to make more money by legitimate means.
* The widespread use of cryptographic techniques and the availability of ever-increasing computing horsepower has led to greater pressure to come up with new and improved encryption systems.
The GNU open-software foundation has developed a "replacement" for PGP known as "GNU Privacy Guard (GNUPG)" is not only publicly available for free, but also does not use either the IDEA block cipher or the RSA algorithm, ensuring freedom from any intellectual-property challenges. The first release of GNUPG was in 1999; the current version can be obtained from "www.gnupg.org".
* Since RSA is computation-intensive, it is slow and cumbersome for use on
handheld computing devices with limited computing horsepower. New public-key
cryptography software based on a class of one-way operations known as
"elliptic curve" functions has been developed, where an elliptic curve
function is of the form:
Y^2 + k1 * X * Y + k2 * Y = X^3 + k3 * X^2 + k4 * X + k5
-- where "k1" through "k5" are constant values. For elliptic curve
cryptography (ECC), this general form of an elliptic curve function is
simplified and modified to:
Y^2 MOD P = (X^3 + k1 * X + k2) MOD P
Without getting into details, this is another algorithm for generating
cryptographic keys that are difficult to backtrack, and it can be used to
generate schemes along the lines of Diffie-Hellman key exchange and RSA. ECC
has two advantages over RSA:
* One of the most significant innovations in contemporary cryptography is the development of a replacement for DES. DES was a worldwide standard for encryption for over two decades, but by the end of the century it was clearly running out of steam. New techniques for cracking ciphers were developed, such as "differential cryptanalysis". This is a technique where slightly different plaintexts are enciphered and then the ciphertexts are compared. Differential cryptanalysis can reduce the length of a brute-force search to crack a DES message by a factor of over a thousand.
In 1999, a group of volunteers running a volunteer network of 100,000 personal computers over the Internet managed to crack a DES-encrypted message in less than a day. The network was controlled by a software system that assigned each of the PCs a block of possible keys to check, with a PC reporting back after the check to indicate if a match had been found. DES users began to use "triple DES", which involved encrypting a message with DES three times, using 56-bit keys. In principle, triple DES uses three 56-bit keys, but in practice some implementations use only two 56-bit keys, with the same key used for the first and third cycles of encryption.
Well before this, in 1997, NIST realized that DES was close to outliving its usefulness and began an evaluation to select a new cipher, the "Advanced Encryption Standard (AES)". Five finalists were selected and examined for weaknesses. All these candidates were regarded as similar in capability and acceptable, but the winner was a block cipher named "Rijndael", pronounced roughly as "raindoll",
Rijndael was developed by two Belgian cryptographers, Vincent Rijmen and Joan Daemen (a guy, by the way), and was derived from an earlier block cipher named "Square". Rijndael can use blocks and keys with a length of 128, 192, or 256 bits, arranged in any combination for a total of nine possible arrangements of key and block size. The cipher can be easily extended to blocks or keys with lengths in larger multiples of 32 bits. Interestingly, Rijmen and Daemen will make little or no money directly off the adoption of their cipher as a standard. However, the prestige and publicity associated with the award will be considerable, and any company with a special expertise in the cipher adopted for AES is likely to find business booming.
* While the Internet has led to aggressive development of the new ciphers, it has also led to a revival of the ancient art of steganography, or hiding data.
Audio and image files tend to be large, and so they make convenient hosts for hidden data. Audio files consist of sequences of digital sound samples that give the value of the audio waveform taken at successive short intervals. For example, standard CD-quality audio consists of 16-bit samples taken at a rate of 44,100 times a second. Data can be hidden in the least significant bit of each sound sample of an audio file. The modification amounts to no more than adding a very slight amount of noise to the file, so slight that nobody could tell the difference between the original and modified files on listening to them (though fanatical audiophiles would no doubt claim they could).
Similarly, a typical image file consists of a grid of color dots, or "pixels", with the color of each pixel defined by three 8-bit values, giving the levels of red, green, and blue in the color. Data can be hidden in the least-significant bit of each of these values, and if the image is a color photograph or other image without large uniform blocks of color, the hidden data cannot be detected simply by looking at the image.
The advantage of steganography is that messages can be distributed without the authorities being aware that they are being sent. There is no reason the data could not be encrypted as well before being hidden. A number of steganography packages are now available for PCs.
Of course, there is a limitation to digital steganography in that "lossy" data compression schemes, which throw away components of the audio or image data to permit a more compact file, cannot be used since they are almost certain to throw away the hidden data. Lossy audio compression schemes such as MP3 or WMA, and lossy image compression schemes such as JPEG, are not appropriate for steganography; non-lossy WAV audio files or PNG image files are much better suited to the job.
It is believed that criminals or international terrorist organizations find steganography using image files a particularly useful way to communicate secret information. One of the problems with any Internet communications from a terrorist's point of view is that they are generally from a specific sender to a specific recipient, allowing security organizations to trace links in the terrorist organization. This problem can be avoided by including secret information in pornographic images, which are then uploaded to a pornographic web site, preferably one with a large amount of traffic. This allows the recipients to hide their pickup of the message in a flood of other traffic to the website.
* Cryptographic systems are also in widespread use outside of the Internet, in the form of "remote key entry (RKE)" systems, sometimes also known as "remote keyless entry" systems. RKE systems have been around for decades in the form of garage-door openers, but the technology has been refined and is now in more widespread use, particularly in the form of RKE systems for high-end cars.
The old garage-door openers are still around, but they provide convenience, not security. The remote and receiver units could be set to unique 8-bit codes using mechanical switches, but this was mainly to eliminate interference between neighbors who both had remotes. Modern automotive RKE systems use a much longer code sequence that changes between uses. If a keychain RKE unit always sent the same or a predictable access code, a potential intruder could simply record its transmission using a radio scanner and play it back later to gain access.
Cheap integrated circuits are now available that provide improved security by generating binary code sequences of 40 bits or so. The simplest such security scheme is the "rolling code" algorithm, basically a form of pseudo-random number generator, which is used with one-way RKE units that can send a code but cannot pick up an acknowledgement from the receiver.
In the rolling code scheme, both the RKE unit the receiver are set to an initial code seed and "rolling algorithm". Every time the key sends an access code to the receiver, both update the code identically according to the rolling algorithm. The initial code seed ensures that the rolling code sequences are effectively unique to a particular RKE system, and the sequences are designed so that it is difficult to backtrack through the values and find the seed. Since the receiver will not always pick up the RKE unit's radio signal, the receiver can "look ahead" 256 codes and still unlock the automobile.
This leads to a problem if the number of unreceived RKE transactions exceeds 256, or somewhat more plausibly the RKE unit is lost or broken. For this reason, automobiles with RKE systems have a "reset" capability. The owner gets in the car using the old-fashioned mechanical key, follows the reset initialization procedure -- say, turning the ignition on and off eight times in less than ten seconds -- and then pushes the button on the RKE unit. The receiver then syncs up to the RKE unit.
Two-way systems offer better security at higher cost. The RKE unit transmits an access code; the receiver reads the code and replies; the RKE unit acknowledges in turn; and the receiver then unlocks the doors. This means that both the RKE unit and the receiver can increment their rolling codes in step, eliminating the need for look-ahead. Incidentally, many RKE units can perform multiple functions, such as unlocking the doors or the trunk. This is done by sending a function code along with the security code.