A Codes & Ciphers Primer

v1.0.4 / 01 jul 13 / greg goebel / public domain

* Codes and ciphers have been around for millennia, often proving of vital strategic importance. They are still in widespread use in the 21st century and are an important component to the operation of the internet. This document provides a brief introduction to codes and ciphers.

3-rotor Enigma cipher machine



* The oldest means of sending secret messages is to simply conceal them by one trick or another. The ancient Greek historian Herodotus wrote that when the Persian Emperor Xerxes moved to attack Greece in 480 BC, the Greeks were warned by an Greek named Demaratus who was living in exile in Persia. In those days, wooden tablets covered with wax were used for writing. Demaratus wrote a message on the wooden tablet itself and then covered it with wax, allowing the vital information to be smuggled out of the country.

The science of sending concealed messages is known as "steganography", Greek for "concealed writing". Steganography has a long history, leading to inventions such as invisible ink and "microdots", or highly miniaturized microfilm images that could be hidden almost anywhere. Microdots are a common feature in old spy movies and TV shows. However, steganography is not very secure by itself. If someone finds the hidden message, all its secrets are revealed. That led to the idea of manipulating the message so that it could not be read even if it were intercepted, and the result was "cryptography", Greek for "hidden writing".

Cryptography takes two forms: "codes" and "ciphers". The distinction between codes and ciphers is commonly misunderstood. A "code" is essentially a secret language invented to conceal the meaning of a message. The simplest form of a code is the "jargon code", in which a particular arbitrary phrase, for an arbitrary example:

   The nightingale sings at dawn.

-- corresponds to a particular predefined message that may not, in fact shouldn't have, anything to do with the jargon code phrase. The actual meaning of this might be:

   The supply drop will take place at 0100 hours tomorrow.

Jargon codes have been used for a long time, most significantly in World War II, when they were used to send commands over broadcast radio to resistance fighters. However, from a cryptographic point of view they're not very interesting. A proper code would run something like this:


This uses "codewords" to report that a friendly military force codenamed BOXER SEVEN is now hunting an enemy force codenamed TIGER5 at a location codenamed RED CORAL. This particular code is weak in that the "SEEK" and "AT" words provide information to a codebreaker on the structure of the message. In practice, traditional military codes are often defined using "codenumbers" instead of codewords, listed in a codebook that provides a dictionary of code numbers and their equivalent words. For example, this message might be coded as:

   85772 24799 10090 59980 12487

Codewords and codenumbers are referred to collectively as "codegroups". The words they represent are referred to as "plaintext" or, more infrequently, "cleartext", "plaincode", "placode", or "plaindata".

Codes are unsurprisingly defined by "codebooks", which are dictionaries of codegroups listed with their corresponding plaintext. Codes originally had the codegroups in the same order as their plaintext. For example, in a code based on codenumbers, a word starting with "a" would have a low-value codenumber, while one starting with "z" would have a high-value codenumber. This meant that the same codebook could be used to "encode" a plaintext message into a coded message or "codetext", and "decode" a codetext back into plaintext message.

However, such "one-part" codes had a certain predictability that made it easier for outsiders to figure out the pattern and "crack" or "break" the message, revealing its secrets. In order to make life more difficult for codebreakers, codemakers then designed codes where there was no predictable relationship between the order of the codegroups and the order of the matching plaintext. This meant that two codebooks were required, one to look up plaintext to find codegroups for encoding, the other to look up codegroups to find plaintext for decoding. This was in much the same way that a student of a foreign language, say French, needs an English-French and a French-English dictionary to translate back and forth between the two languages. Such "two-part" codes required more effort to implement and use, but they were harder to crack.

* In contrast to a code, a "cipher" conceals a plaintext message by replacing or scrambling its letters. This process is known as "enciphering" and results in a "ciphertext" message. Converting a ciphertext message back to a plaintext message is known as "deciphering". Coded messages are often enciphered to improve their security, a process known as "superencipherment".

There are two classes of ciphers. A "substitution cipher" changes the letters in a message to another set of letters, or "cipher alphabet", while a "transposition cipher" shuffles the letters around. In some usages, the term "cipher" always means "substitution cipher", while "transpositions" are not referred to as ciphers at all. In this document, the term "cipher" will mean both substitution ciphers and transposition ciphers. It is useful to refer to them together, since the two approaches are often combined in the same cipher scheme. However, transposition ciphers will be referred to in specific as "transpositions" for simplicity.

"Encryption" covers both encoding and enciphering, while "decryption" covers both decoding and deciphering. This also implies the term "cryptotext" to cover both codetext and ciphertext, though the term "encicode" is sometimes seen instead. The science of creating codes and ciphers is known, as mentioned, as "cryptography", while the science of breaking them is known as "cryptanalysis". Together, the two fields make up the science of "cryptology".



* A simple substitution cipher in which the same cipher letter is always exchanged for the same plaintext letter is known as a "monoalphabetic substitution cipher". For example, we could define a cipher alphabet as follows:

   plaintext alphabet:   abcdefghijklmnopqrstuvwxyz

Given the plaintext:

   erase the tapes

-- and the cipher alphabet above, we get:


Note that in this example plaintext is printed in lowercase, while ciphertext is printed in uppercase. This convention will be followed in the rest of this document.

Monoalphabetic substitution ciphers go back to at least the fourth century BC. One of the simplest monoalphabetic substitution ciphers, known as a "Caesar shift", associated with Julius Caesar, involves shifting letters by a number of positions, say three:

   plaintext alphabet:  abcdefghijklmnopqrstuvwxyz

Using this cipher alphabet, Alice can convert the plaintext:

   beware the ides of march

-- into the ciphertext:


This can be made even more cryptic by removing the spaces:


-- and it still remains more or less readable when translated back to plaintext:


With 26 letters in the English alphabet, there are of course 25 different possible Caesar shift cipher alphabets. All Bob needs to read the cipher is a number from 1 to 25 to define the shift. This number can be thought of as a "key" associated with the Caesar shift enciphering "algorithm".

A Caesar shift cipher is ridiculously easy to crack, since all one has to do is try all 25 Caesar shift cipher alphabets until one works. Interestingly, however, it is still in use, in the form of the "rot13" scheme used on internet forums, which is a 13-place Caesar-shift used to hide the punchlines of jokes and the like.

* A more secure way to build a substitution cipher is to completely mix up the mappings between the plaintext and ciphertext alphabets. There are vast number of possible ways to scramble the alphabet, and such a scrambling might seem on the face of it very secure. One way to come up with a mixed cipher alphabet is for Alice to take a "keyphrase" consisting of, say, a name, such as RICHARD MILHAUS NIXON, write it down while eliminating any redundant letters, and then complete the cipher by writing down the remaining letters of the alphabet in alphabetical order:

   plaintext alphabet:  abcdefghijklmnopqrstuvwxyz

This is a simple cipher algorithm, but even if a codebreaker knows that this general scheme was used, the message still cannot be read without the keyphrase, and a brute-force approach to cracking it is very difficult. This is a fundamental principle of cryptography, stated by a 19th-century Dutch linguist & cryptographer, Auguste Kerckhoffs von Niewenhof), and known as "Kerckhoffs' Principle": The security of a cipher should not depend on an enemy's ignorance of the enciphering algorithm, only the enemy's ignorance of the key. In fact, codebreaking is often focused on discovering keys, since the cipher scheme may be well understood.



* For an example of a transposition, suppose Alice wants to send Bob the message:

   meet me near the clock tower at twelve midnight tonite

One way to transpose this message is for Alice to "write in" the words vertically in five rows without any spaces as follows:

   m e e t m
   e n e a r 
   t h e c l 
   o c k t o
   w e r a t 
   t w e l v 
   e m i d n 
   i g h t t 
   o n i t e 

-- and then "read off" each column, top to bottom, as follows:

   metowteio  enhcewmgn  eeekreihi  tactaldtt  mrlotvnte

Bob then "writes in" the message in five parts:

   M E T O W T E I O
   E N H C E W M G N
   E E E K R E I H I
   T A C T A L D T T
   M R L O T V N T E

-- and then "reads off" the message from the columns:

   meet me near the clock tower at twelve midnight tonite

A transposition of the form shown above is extremely easy to crack. Holmes just writes down the letters of the transposition in rows, increasing the length of the rows until he sees something that makes sense. However, Alice could make things more difficult for a codebreaker a bigger headache by reading off columns in an alternating "down" and "up" fashion, or by reading off the transposition in a "spiral" pattern -- "down" on the left side, "right" across the bottom, "up" on the right side, "left" across the top to the second-to-left column, "down" again, and so on until all letters were transposed. Even more sophisticated transpositions use a "checkerboard" pattern -- such as a "knight's tour", a grid of numbers that specify the sequence of movements of a chess knight across the grid.



* Given the large number of possible monoalphabetic substitution cipher alphabets, it might seem like a substitution cipher would be very hard to break. In reality, it's very easy if given a reasonably large ciphertext message to analyze, but it took over a thousand years to figure out how.

The basic approach for cracking a monoalphabetic substitution cipher was invented by a multi-talented medieval Arabic scholar named al-Kindi, and is now known as "frequency analysis". His work was an outgrowth of efforts by Arabs to perform textual analyses of religious texts to see if they actually were written by the Prophet.

Frequency analysis is a statistical method. In every language, some letters are used on the average more than others, and the percentages of letters in different languages tends to be constant. For example, in English text the three most common letters on the average are "e", "t", and "a", while the three least common are "x", "q", and "z". (Average letter frequencies will be different in different languages.) This means that if a codebreaker sees that "O", "G", and "B" are the most common letters in a ciphertext, then they are likely to represent "e", "t", and "a".

Frequencies of characters in any text may deviate from the average, so this mapping may not be perfectly accurate. However, frequency analysis can also be performed on pairs of letters, or "digraphs", in the ciphertext -- "ee" is common in English text, but not "aa". A codebreaker can also spot common triplets, or "trigraphs", or entire words -- "the" is the most common word in English text. Text may have predictable elements -- for example, Nazi correspondence might start with "Heil Hitler!" Such predictable phrases are known as "cribs". Military correspondence tends to follow standard formats and is often loaded with cribs.

Once given these clues, a codebreaker can then use educated guesswork to "fill in the blanks". For example, an incomplete word of the form "-u-m-ri-e" in text sent from a naval base is likely to be "submarine". This is the same skill that is used to solve crossword-puzzles, and is known to cryptologists as "anagramming". The usage of the word in cryptology is somewhat different from the popular usage, which refers to a scrambling of the letters of one word into a different word.

If the frequency analysis of a ciphertext shows a seeming match to normal text, then the cipher is likely to be a transposition. If frequency analysis seems to show a different mapping of characters from normal text, but trying to plug the proper characters back into the ciphertext gives absolutely no useful results, then the cipher is likely a combination substitution and transposition.

Of course, frequency analysis requires enough text to permit the construction of a useful table of letter frequencies. This is a general truth of cryptanalysis: the more cryptotext available, the easier the cryptotext is to crack, and on the other side of the coin, short or fragmentary cryptotexts can be difficult to crack even if they use an insecure cryptographic scheme.



* Solving a monoalphabetic substitution cipher is easy. Solving even a simple code is difficult. Decrypting a coded message is a little like trying to translate a document written in an alien language, with the task basically amounting to building up a "dictionary" of the codegroups and the plaintext words they represent.

One fingerhold on a simple code is the fact, mentioned in the previous section, that some words are more common than others, such as "the" or "a" in English. In telegraphic messages, the codegroup for "STOP" (end of sentence) is usually very common. This helps define the structure of the message in terms of sentences, if not their meaning. Further progress can be made against a code by collecting many messages encrypted with the same code and then obtaining intelligence background on the messages, such as the location from where a message was sent, and where it was being sent to; the time the message was sent; events occurring before and after the message was sent; and the normal habits of the people sending the coded messages. Cribs can be helpful as well.

Various tricks can be used to "plant" or "sow" information into a code, for example by executing a raid at a particular time and location against an enemy, and then examining coded messages sent in response to the raid. Coding errors are a particularly useful lever against a code, and naturally people are bound to make errors, sometimes disastrous ones, sooner or later. Of course, planting information and exploiting errors works against ciphers as well.

* The most obvious and, in principle at least, simplest way of cracking a code is to steal the codebook through bribery, burglary, or raids. Constructing a new code is like building a new language and writing a dictionary for it -- a big job. If a code is compromised, the whole task has to be done all over again, and that means a lot of work for both cryptographers and the code users. In practice, when codes were in widespread use, they were usually changed on a periodic basis.

Once codes have been created, their distribution is logistically clumsy, and the probability that the code will be compromised is high. In contrast, the security of ciphers is, as mentioned earlier, generally dependent on protecting the cipher keys. Cipher keys can be stolen and people can betray them, but they are much easier to change and communicate.



* The West learned about frequency analysis in the 15th century, forcing the development of better encryption schemes:

One of the most significant developments was the "polyalphabetic substitution cipher", which was described in its definitive form in a paper published in 1586 by a French diplomat named Blaise de Vigenere. A "Vigenere cipher" uses 26 substitution ciphers, organized using a "Vigenere square" as shown below, with some spacing added here to make it legible:

        a bcd efgh ijk lmno pqr stuv wyxz


        a bcd efgh ijk lmno pqr stuv wyxz

This defines 26 different Caesar shift ciphers, each of which is weak in itself but which in combination result in a much more secure cipher. The idea in the Vigenere cipher is to use a cipher key to select different cipher alphabets in succession as letters are enciphered. Suppose Alice wants to encipher the phrase:

   use the force luke

-- with a Vigenere cipher, using the cipher keyword "WARTHOG". All she has to do is scan down the square defined above and match the cipher alphabet letter to a particular row, then select the cipher character matching the plaintext letter for that row:

   W: cipher alphabet 23 gives  u -> Q
   A: cipher alphabet 01 gives  s -> S
   R: cipher alphabet 18 gives  e -> V
   T: cipher alphabet 20 gives  t -> M
   H: cipher alphabet 08 gives  h -> O
   O: cipher alphabet 15 gives  e -> S
   G: cipher alphabet 07 gives  f -> L
   W: cipher alphabet 23 gives  o -> K
   A: cipher alphabet 01 gives  r -> R
   R: cipher alphabet 18 gives  c -> T
   T: cipher alphabet 20 gives  e -> X
   H: cipher alphabet 08 gives  l -> S
   O: cipher alphabet 15 gives  u -> I
   G: cipher alphabet 07 gives  k -> Q
   W: cipher alphabet 23 gives  e -> A

This gives:


Simple frequency analysis cannot crack a Vigenere cipher, and the number of possible keys is so great that finding the right key by trial-and-error is effectively impossible. Despite the simplicity and elegance of the Vigenere cipher, it was mostly ignored for the next several centuries, since it was regarded as too cumbersome for general use.



* The invention of telegraphy in the mid-19th century revolutionized communications, and also had a significant impact on cryptology. Since telegrams were paid for on a word-by-word basis, there was an incentive to reduce the word count, and so "commercial codes" were developed from the outset to provide what might be called in modern terms "data compression". They also provided some security against casual reading, though they were one-part codes and not generally intended to provide security. Commercial codes did generally offer superencipherment schemes for users who wanted more secrecy.

Commercial codes remained in use until well after the First World War. Some of the entries in a commercial code published in the 1920s seemed almost tailored for encrypting contemporary pulp fiction:

   BUKSI    Avoid arrest if possible.
   OBNYX    Escape at once.
   PYTUO    Collided with an iceberg.
   YBDIG    Plundered by natives.
   CULKE    Bad as possibly can be.

The telegraph had a particular impact on military operations, contributing along with the invention of the locomotive and improved firearms to a revolution in warfare. However, the telegraph was vulnerable to interception, meaning that good cryptographic schemes were required to ensure security. Codes really couldn't do the job for an army on the move in the field, because of the logistical difficulty of handling the codebooks. The Vigenere cipher came into common use, assisted by a simple invention called a "cipher disk", described below. However, although the Vigenere cipher had been regarded as indecipherable for a long time, crytanalysts were finally able to get a handle on it. For example, if multiple messages encrypted using a Vigenere cipher and the same cipher key were intercepted, frequency analysis could be used across the first letter of all the messages in parallel, the second letter, and so on.

* Other cipher schemes were developed for military telegraphic use. During the American Civil War, Union forces used a "route cipher", basically a transposition cipher based on entire words, not characters. A small set of codewords were implemented to conceal words that might provide too much of a clue to snoopers, with cipher books distributed describing various transposition patterns and the codewords to use.

A general technique for cracking transpositions called "multiple anagramming" was developed in the 1870s. It requires two messages that have been transposed in the same way -- for a simple example, consider two messages consisting only of one five-letter word each:


A little examination shows that both of these two transpositions could be unscrambled into two different words:


Given each message on its own, there's no way to figure out which of the two unscramblings would be correct for each message. However, performing the same unscrambling on both messages in parallel shows that there's only one unscrambling that yields an intelligent answer for both messages:


   12345      15243     45213     42513

* A number of new cipher techniques were also developed based on the "Polybius square" or "checkerboard". This was a way of converting letters into pairs of numbers, devised in classical Greek times by a scholar named Polybius. Updated into English, a Polybius square consists of the letters of the alphabet arranged as a 5-by-5 checkerboard grid with rows and columns numbered, and "I" and "J" assigned to the same grid location:

        1   2   3   4   5

   1    A   B   C   D   E
   2    F   G   H  I/J  K
   3    L   M   N   O   P
   4    Q   R   S   T   U
   5    V   W   X   Y   Z

The row-and-column index numbers are then substituted for letters. For example, "help" becomes "23 15 31 35". For added security, the arrangement of the letters in the grid can be scrambled, preferably by using a cipher keyword to determine the scrambling. For example, if we use the key "RICHARD MILHAUS NIXON", we get a checkerboard that looks like this:

        1   2   3   4   5

   1    R  I/J  C   H   A
   2    D   M   L   U   S
   3    N   X   O   B   E
   4    F   G   K   P   Q
   5    T   V   W   Y   Z

No matter how the grid is arranged, however, as stated it's just a monoalphabetic substitution cipher, with pairs of ciphertext digits in place of plaintext letters. It offers little security, though it can be used as a basis for "semaphore" codes, in which a signaler holds two flags in five different positions each to send the full alphabet. However, the checkerboard has much more potential. Each plaintext letter is represented by two ciphertext digits, and that gives a new option for encrypting the message.

Suppose Alice has a checkerboard based on the key "RICHARD MILHAUS NIXON" as above, and the plaintext:

   down with big brother

She can obtain row-and-column indexes for the letters in this plaintext from the grid as follows:

   d:   row = 2 / column = 1
   o:   row = 3 / column = 3
   w:   row = 5 / column = 3
   n:   row = 3 / column = 1

Now she divides the plaintext into blocks of, say, five letters, and writes the indexes vertically underneath the letters:

   downw ithbi gbrot her
   23535 15131 43135 131
   13313 21442 24131 451

To produce the final encryption, she concatenates the two rows on a block-by-block basis:

   2353513313 1513121442 4313524131 131451

The message is divided into blocks to make encryption and decryption more manageable; there's no particular reason to use a block size of five, any other reasonable value will work as well. This particular scheme breaks down each letter into two cipher letters, a concept known as "fractionation"; and then it breaks the message down into blocks and concatenates the two halves, a concept known as "seriation". The resulting message cannot be cracked by simple frequency analysis. This scheme is known as a "bifid" cipher, and was introduced by a Frenchman named Felix Marie Delastelle in a book published in 1901.

It is also possible to "map" plaintext letters to three or more digits that can then be transposed, and such a three-digit mapping is called a "trifid" cipher. Some variations on fractionating ciphers are extremely devious and hard to crack.



* The First World War introduced wireless telegraphy into military operations, greatly extending the communications revolution begun by the telegraph. However, radio communications placed new demands on security, since it was so very easy to intercept messages sent over the airwaves.

Since warships at sea could now communicate with shore stations over wireless, navies adopted code systems. Naval codebooks were often bound with metal plates in the covers so they could be thrown overboard in an emergency and sink. Since World War I on land quickly bogged down on land into immobile trench warfare, armies also adopted codes for frontline operations. There was relatively little problem in distributing the codebooks, the armies not being on the move, and "trench codes" came into common use.

Sophisticated cipher systems were also developed. One of the most significant achievements in cryptography in the First World War was a cipher that was, and remains, uncrackable even in principle. As is typical of black magic, it had a major catch.

A Vigenere cipher based on a keyword becomes more secure with a longer keyword. It also becomes more secure if the keyword is made more unpredictable, to prevent a codebreaker from using guesswork to determine the keyword. That means that the most secure possible Vigenere cipher is one where the keyword is as long as the message, with the keyword made up of completely unpredictable random characters.

To implement such a cipher, sets of random keys could be printed on a pad of paper, with each page on the pad used once and then thrown away. As a result, this scheme is called a "one-time pad" cipher. The interesting thing about the one-time pad is that it is logically impossible to crack by analytic means. Imagine being given a page of completely random letters: it's impossible to "crack" because there's no message there, it's just noise. Taking a message and changing all the characters at random also results in a page of noise, and it's just as uncrackable. The only way the message can be extracted is by telling the recipient of the message with a one-time pad what random changes were made to the letters in the text.

A one-time pad cipher has an interesting property: it is not so much true that no signal can be extracted from pure noise, but that any signal can be extracted from noise. Since there's no fixed pattern in the ciphertext or the key, a key can be easily synthesized to produce every possible message that will fit into the number of letters, such as instructions to attack the enemy, a shopping list, or dirty limericks.

* The catch to the one-time pad cipher is that a specific random key can only be used once to encipher one message. If two messages are enciphered using the same key, then it is no longer impossible to crack the cipher, since a (painful) exercise in multiple anagramming could be used to decipher the texts, trying every possible key on both messages until both messages are revealed. This makes the one-time pad cipher logistically clumsy to deal with, worse than a code, requiring distribution of unique pads to every user -- and so it is only used in very high-security communications.



* The "cipher disk", mentioned above, was one of the first cipher machines, invented in the 15th century. The cipher disk consists of two nested disks, including an outer disk labeled with all the letters of the alphabet, and an inner disk also labeled with all the letters of the alphabet -- but not necessarily in the same order. The outer disk defines the plaintext alphabet, while the inner disk defines a monoalphabetic substitution cipher alphabet.

a cipher disk

Suppose Alice takes her cipher disk and rotates the inner disk so that the letter "A" lines on the inner disk lines up with, say, the letter "q" on the outer disk. She can then encipher a message by taking each letter, looking up the plaintext letter on the outer disk, and then writing down the matching ciphertext letter next to it on the inner disk. The cipher disk can be also made in the form of a "slide", with two alphabets written on strips of cardboard, or whatever, that can be slid next to each other. Of course, each slide should have two repeating alphabets to permit them to be read conveniently when offset.

The cipher as described remains a feeble monoalphabetic substitution cipher, which is not only easy to crack, but also so easy to use that the cipher disk hardly seems very handy. Where the cipher disk comes in useful is in dealing with a Vigenere cipher.

For example, suppose Alice wants to encrypt a message using a Vigenere cipher with the cipher key word "WARTHOG". To encipher the first letter, she moves the "W" on the inner disk to match up with the "a" on the outer disk, and then finds the ciphertext letter on the inner disk matching the desired plaintext letter on the outer disk. Next, Alice enciphers the second letter on the inner wheel by moving the "A" on the inner disc to match the "a" on the outer disk, then looking up the ciphertext letter on the inner disk that matches the desired plaintext letter on the outer disk. She repeats this process for the rest of the keyword, and then starts all over again at the beginning of the keyword.

* The cipher disk led to a more sophisticated cipher machine, invented late in the 19th century by a French military officer named Etienne Bazeries, and known as the "Bazeries cylinder". (The basic idea was actually invented by American President Thomas Jefferson a century earlier, but then forgotten.) A Bazeries cylinder consists of a set of roughly 20 to 30 numbered disks, with a different cipher alphabet on the edge of each disk, and a hole in the center of the disks to allow them to be stacked on an axle. The disks are removeable and can be mounted on the axle in any order desired. The order of the disks can be considered the cipher key for the Bazeries cylinder, with both Alice and Bob arranging the disks in the same predefined order.

To encrypt a message, Alice rotates the disks to produce the plaintext message along one "row" of the stack of disks, and then selects another row as the ciphertext. To decrypt the message, Bob rotates the disks on his cylinder to produce the ciphertext along a row. It is handy if both Alice and Bob know the offset of the row, but not really necessary since Bob can simply look around the cylinder to find a row that makes sense. It was also possible to implement the same scheme with strips containing cipher alphabets inserted on a frame. Both schemes were used in World War I and well into World War II; they provided strong encryption, but they were hardly indecipherable.

simple cipher devices:  strip and disk ciphers

* After World War I, a number of much more sophisticated cipher machines were built, the most famous being the "Enigma", patented in 1918 by a German engineer named Arthur Scherbius.

As it emerged, the Enigma was a wooden box with a keyboard and a bank of lettered lights corresponding to the keys. To encrypt a message, a plaintext character was typed in, and after scrambling the appropriate light was turned on to give the ciphertext character. The ciphertext was then sent using Morse code over radiotelegraph. The operation of the machine was "reciprocal", in that if the "A" key was pressed and lit up the "Z" lamp, then pressing the "Z" key would light up the "A" lamp. That meant that at the receiving end, as long as the operator had an Enigma machine set up in same way, he could just type in the ciphertext character to get the plaintext character: the procedure for encryption and decryption was the same.

The scramblings between the input and output were performed by what was called a "rotor" system. A rotor was a wheel with 26 electrical contacts on each side and scrambled wiring linking the contacts on the two sides. The scrambled wiring represented a mixed substitution cipher alphabet. There were three rotors in series fed into one another, with this assembly referred to as a "basket".

Every time a key was pressed, the first rotor would turn one position. Every time the first rotor went full circle (after 26 keypresses), the second rotor would turn one position. Every time the second rotor went full circle (after 26 * 26 = 676 keypresses), the third rotor would turn one position. When the third rotor went full circle (after 26 * 26 * 26 = 17,576 keypresses), the sequence would start all over again.

The third rotor was wired into a disk called the "reflector" that simply rerouted the connections on the output of the third disk back into themselves. In operation, the electrical signal from the input key ran up through the basket of rotors to the reflector, and then back down through the basket to the output lamp. This is how the Enigma achieved reciprocal operation, allowing both encryption and decryption using the same configuration.

reciprocal operation of Enigma cipher machine

Each of the rotors had a different wiring scheme. The three rotors were removeable and could be rearranged in six different ways. It was also possible in principle to have a stockpile of more than three rotors and select three from the set for use, further multiplying the possibilities. The Enigma also had a plugboard that allowed patch cords to be swapped around to mix six different sets of letters. Setup of the Enigma involved selecting the order of the three rotors, selecting their starting positions, and selecting the patch cord arrangement. This setup amounted to the "key" or "daykey" for the Enigma cipher, and there were a vast number of possible daykeys.

* The German military standardized on the Enigma for encryption before World War II. Despite the complexity of the Enigma cipher, Polish cryptanalysts were able to figure out patterns in the operation of the Enigma that allowed them to determine the daykey, using a set of Enigma simulators known for some obscure reason as a "bombe".

However, in 1938 the Germans improved their Enigma security by providing five rotors, increasing the number of possible configurations of three rotors selected from the set, and increased the number of pairs of letters swapped by the plugboard to 10. That greatly multiplied the possible numbers of daykey configurations, demanding a much more complicated and expensive bombe to figure out the settings. By this time, Nazi dictator Adolf Hitler was making threatening noises against Poland, and so the Poles got in touch with British intelligence to tell them how Enigma had been cracked.

The Nazis invaded Poland in September 1939, setting off World War II. They quickly overran the country. The British were hard-pressed by defeats early in the war, and to help fight back set up a sophisticated codebreaking operation at an estate north of London named Bletchley Park. Although the Germans had tightened up their procedures for Enigma operation, eliminating the loopholes that the Poles had exploited to ferret out daykeys, Bletchley Park cryptanalysts -- most significantly a brilliant Oxford mathematician named Alan Turing -- were able to develop a new, more sophisticated bombe system to once again crack Enigma.



* A range of cryptographic schemes were used in World War II, including traditional military codes, much like the naval codes and trench codes used in World War I, as well as the Bazeries cylinder and slide schemes. The war also saw considerable use of "telecipher" systems for high-level secure communications.

A telecipher system was an encrypted teletypewriter. Teletypewriters used a sequence of on-off electrical signals to transmit and receive letters, using a scheme known as the "Baudot code". If the "on" signal is represented as a "1" and the "off" signal represented as a "0", then the Baudot code had the form of a set of "binary" numbers as follows:

   00011      A      -
   11001      B      ?
   01110      C      :

This scheme uses five "binary digits" or "bits" per letter; there were "shift" characters to permit use of a secondary set of numbers and punctuation characters.

A telecipher machine scrambled the "stream" of Baudot plaindata using what was known in the old days as "modulo-two arithmetic" and is now known as an "exclusive OR" or "XOR" operation. The basic rules of the XOR operation are as follows:

   0 XOR 0  =  0
   0 XOR 1  =  1
   1 XOR 0  =  1
   1 XOR 1  =  0

In simpler terms, if two bits are XORed that have the same value, the result is a "0". If two bits are XORed that have a different value, the result is a "1". XOR can be performed with multibit values on a bit-by-bit basis -- for example, two five-bit values X and Y can be XORed to give a 5-bit value Z:

   X:  10011
   Y:  00110

   Z:  10101

The XOR operation has the interesting property of being "reversible", in that if two binary values X and Y are XORed together, then the result Z can be XORed with Y to give X again:

   Z:  10101
   Y:  00110

   X:  10011

A telecipher machine used an arrangement of electromagnetic relays to XOR the plaindata with a stream of seemingly random bits as a "mask":

    data:  10100 00001 00110 10010 00100 10100 00110 10000 10010 00001 01010
    mask:  01001 00101 11011 10110 00111 11001 10101 01010 10000 11001 10001

  cipher:  11101 00100 11101 00100 00011 01101 10011 11010 00010 11000 11011

The enciphered binary stream could be deciphered by simply XORing it again with the same mask.

The mask was generated by an arrangement of rotating "wheels", with sets of pins on them that could be set in or out. The wheels could generate very long streams of bits in an unpredictable fashion, and telecipher machine messages could be very hard to crack. In fact, the British developed one of the first electronic computers, named "Colossus" and built with vacuum tubes, to crack German telecipher messages.

* In the postwar period, electronic computers became much more widespread, and modern cryptography is effectively based on computer-based ciphers. Two classes of "digital" ciphers were developed: "stream" ciphers and "block" ciphers.

The old telecipher systems essentially implemented stream ciphers. A stream cipher operates by taking a stream of plaindata bits and XORing them on a bit-by-bit basis with a mask of more or less random bits. The cipher stream can then be deciphered by XORing it again with the same mask. The computer generates the mask using some variation of "pseudo-random" bitstream generation algorithm. Block ciphers, in contrast, break up the stream of data bits and perform a complicated set of shufflings and XOR operations on each block. In either case, a "key" consisting of a number of bits is used to specify encryption and decryption.

In the 1970s, the US National Bureau of Standards (NBS, now the National Institute of Standards & Technology / NIST) created a specification for a "Data Encryption Standard (DES)", using a 56-bit block cipher scheme. DES became a standard for most of the rest of the century. Modern digital ciphers will use a key at least 100 bits long.

By the 1970s, use of computing and computer networking was beginning to take off, and some cryptologists were looking forward to the future. One of the issues that was considered was the "key exchange" problem. All ciphers up to that time were "symmetric", meaning the same key was used to encrypt and decrypt a message. However, symmetric ciphers posed some problems in the emerging world of computer networking. In modern terms, suppose customers are visiting a business website and want to establish secure communications, say to make a credit-card payment. In that case, the business would have to exchange a different key with each customer to guarantee security, which would be very cumbersome.

In 1975 a cryptologist named Whitfield Diffie at Stanford University in California came up with an idea that seems almost obvious in hindsight but must have sounded preposterous at the time: "public key" cryptography. The idea was that a public key cipher would have two keys: a "public" key, available to anyone, that can be used to encipher a message; and a "private" key, known only to one person, that can be used to decipher a message. The bizarre thing about a public key cipher is that the public key can be used to encipher a message, but once that message is enciphered, only the private key can decipher the message.

In public key cryptography, if Bob wants to send an encrypted message to Alice, he obtains her public key. Alice could put her public key on her website or business card if she liked for anyone to use. Bob uses Alice's public key to encipher a message and then sends it off. He doesn't need to worry about anyone reading the message, since the public key can't be used to decrypt it. Only Alice can decrypt it, using her private key.

Diffie had no idea of how to do this, but in 1975 he and his colleagues published the concept anyway to inspire other cryptologists. About two years later, three researchers at the Massachusetts Institute of Technology (MIT) computer science department named Ron Rivest, Adi Shamir, and Ron Adleman announced that they had developed a public key cipher, named "RSA" after their initials. RSA, and other public-key ciphers that followed, had limitations in that they required a lot of computing power and were relatively easy to crack. As a result, they weren't used to encipher long messages, instead being used to encipher a key for a block cipher that was used to send the actual message.

* One of the interesting things that could be done with public-key cryptography was "message authentification" -- that is, providing a guarantee that a message was actually sent by the person who was supposed to have sent it. Normally, in a public-key cipher, a message is encrypted with the public key and decrypted with the private key. It also works the other way around: a message can be encrypted with the private key and decrypted with the public key. Such an "inside-out" encryption, to give it a name, sounds absurd, since anyone could decrypt the message, but the point is that only the person with the private key could have encrypted it.

To validate a message, a "hash" can be made of the message. A hash is a relatively short string of bits that is created from the message, with the interesting property that the slightest change in the message will result in a very different hash. Alice can send an encrypted message to Bob using normal cryptographic methods, and then can send a hash to Bob using inside-out encryption. Bob extracts the hash with Alice's public key and then performs a hash on the decrypted message to see if it matches the hash sent by Alice. If Alice didn't really send the message, Bob won't be able to decipher it; if the message has been tampered with, Bob's hash won't match.

There's another aspect of message authentification. Suppose Bob hasn't heard from Alice in years, has lost track of her, and then starts getting messages from her again. How does he know the messages are really from Alice and not somebody, say Zelda, pretending to be Alice? Zelda can hand him her public key and Bob would be no wiser.

If Bob wants to make sure Alice really is Alice, he can obtain a "digital certificate" or "cert" from a "certification server". These are secure network servers operating on a "trusted third party" basis that store the public keys associated with specific users, and provide them on demand. Bob can get the cert associated with Alice from a certification server and then check to see if the public key he has matches that in Alice's cert.

* Modern digital encryption schemes are commonly used on the internet. For one particularly important example, most people who surf the internet to make online purchases have used cryptography to provide their credit-card numbers to vendors, using the "Secure Sockets Layer (SSL)" protocol. Pages protected by SSL are designated with an "https" prefix instead of the conventional "http" prefix.

SSL works more or less transparently to the user, and uses both public-key and symmetric ciphers. Supposed Alice wants to make a purchase from an online vendor and needs to give the vendor her credit-card number. Her web browser will download the vendor's public key, validate it against the vendor's cert -- a web browser usually maintains a list of certs for prominent online businesses -- and use it to encrypt a secret symmetrical key produced automatically at (more or less) random by the Web browser. The Web browser will pass the encrypted secret key to the vendor, and this secret key will be used to encrypt the session. Actually, several secret keys are exchanged, but a detailed discussion of the complexities is a bit beyond the scope of this document.



* As a footnote to the subject of cryptology, one of the more dubious aspects of the profession is the hunt for hidden ciphers in great works of literature, particularly the Bible. For want of a better term it might be called "pseudocryptology". The usual approach of pseudocryptologists is to scan through the text of interest and take out letters at intervals, for example every 13th letter, every 100th letter, or every 1,617th letter. They then arrange the letters in a block and see if patterns pop out. The patterns are typically identified as scrambled words buried in a stream of gibberish. It is not too surprising that one can actually extract seemingly sensible remarks using this scheme. It's about on a logical level with, say, playing the music of the Beatles backwards and listening for secret messages left by John Lennon.

A modern advocate of such hidden codes, Michael Drosnin, has published a series of BIBLE CODE books that made predictions of a wide number of events hidden in the Bible. The books made the best-seller lists, despite the fact that most of the predictions identified by Drosnin in his books were, conveniently, of events that had already happened. His predictions of actual future events, such as the end of the world in the year 2000, were somewhat inaccurate.

Drosnin incautiously challenged critics to find similar predictions about, say, the assassination of a prime minister in MOBY DICK, and so an Australian mathematician named Brendan McKay went out and did precisely that, identifying "predictions" of nine such assassinations. Similar exercises were performed on WAR AND PEACE and other classics. However, pseudocryptology has been around for a long time -- and though obviously bogus, it still refuses to die off.

* In 2001 I released the first edition of the multichapter online document CODES, CIPHERS, & CODEBREAKING, which has evolved into a fairly thorough and extensive discussion of the subject. It's proven reasonably popular and I'm basically pleased with it, but its detail created an unavoidable problem: it was too much for a lot of people to want to read.

As a result, I decided to render it down into a one-chapter primer document for those who were curious but didn't want to do a lot of work. Those who have read this primer and are interested in further investigation can go on to the parent document. Since all this document amounts to is a grossly abridged version of the parent document, no sources are cited here.

* Revision history:

   v1.0.0 / 01 may 07 
   v1.0.1 / 01 apr 09 / Review & polish.
   v1.0.2 / 01 oct 10 / Review & polish.
   v1.0.3 / 01 aug 11 / Review & polish.
   v1.0.4 / 01 jul 13 / Review & polish.