The Secret Language

Ron Hipschman

When you were a kid, did you have a "Captain Midnight" decoder ring? With it, you could send messages to a friends that no one else could read. Or perhaps you remember using special symbols to write notes to your "squeeze" in class. If the note was intercepted , your teacher, could learn nothing about your romance.

In more serious uses, codes and ciphers are used by our military and diplomatic forces to keep confidential information from unauthorized eyes. Businesses also send data that has been encoded to try and protect trade secrets and back-room deals. After all, you wouldn't want your competitor to know that you were about to acquire their company with a leveraged buy-out.

The study of enciphering and encoding (on the sending end), and deciphering and decoding (on the receiving end) is called cryptography from the Greek (kryptos), or hidden and (graphia), or writing. If you don't know Greek (and not many of us do) the above letters could be a form of code themselves! Although the distinction is fuzzy, ciphers are different from codes. When you substitute one word for another word or sentence, like using a foreign language dictionary, you are using a code. When you mix up or substitute existing letters, you are using a cipher. (I told you the difference was fuzzy, and you can combine codes and ciphers by substituting one word for another and then mixing up the result.) We'll concentrate on ciphers.

For a cipher to be useful, several things must be known at both the sending and receiving ends.

  1. The algorithm or method used to encipher the original message (known as the plaintext).
  2. The key used with the algorithm to allow the plaintext to be both enciphered and deciphered.
  3. The period or time during which the key is valid.

By way of analogy, to get into your home you would put a key in a lock to open the door. This process (the use of a key and a lock) is the method or algorithm. Now this method only works if you have the proper key to stick in the lock, and your key will be valid only as long as you are the resident of the particular abode. The next resident will have the locks changed to a different key to make sure that you cannot enter even though you may know the method.

The selection of the above three items - algorithm, key and period - depend on your needs. If you are in the battlefield and are receiving current tactical data, you want an algorithm that makes it easy to decipher the message in the heat of battle. On the other hand, you must also assume that your opponent has intercepted your enciphered message and is busy trying to break it. Therefore you must choose an algorithm (method) that is complicated enough so that by the time your opponent figures it out, the data will be worthless. The easier the algorithm you choose, the more often you will have to change the key that unlocks the code - if you want to keep your enemy in the dark.

Ciphers are broken into two main categories; substitution ciphers and transposition ciphers. Substitution ciphers replace letters in the plaintext with other letters or symbols, keeping the order in which the symbols fall the same. Transposition ciphers keep all of the original letters intact, but mix up their order. The resulting text of either enciphering method is called the ciphertext. Of course, you can use both methods, one after the other, to further confuse an unintended receiver as well. To get a feel for these methods, let's take a look at some ciphers.


Substitution ciphers and decoder rings

We use substitution ciphers all the time. (Actually, substitution ciphers could properly be called codes in most cases.) Morse code, shorthand, semaphore, and the ASCII code with which these characters are being stored in inside my Macintosh are all examples. (ASCII stands for American Standard Code for Information Interchange, just in case you're interested.) The only difference between these and the spy codes is that the above examples are standardized so that everybody knows them.

The Captain Midnight decoder ring (which is an "encoder" ring as well) allows you to do a simple substitution cipher. It usually has two concentric wheels of letters, A through Z. You rotate the outside ring and substitute the letters in your message found on the outside ring with the letters directly below on the inside ring (see diagram). Here, the algorithm is to offset the alphabet and the key is the number of characters to offset it. Julius Caesar used this simple scheme, offsetting by 3 characters (He would have put the "A" on the outer ring of letters over the "D" on the inner ring if he had owned a Captain Midnight decoder ring.) The word "EXPLORATORIUM" thus becomes "HASORUDWRULXP." Such a scheme was easily broken and showed a certain level of naivete on Caesar's part concerning the enemy's intelligence.


Substitution cipher wheels

Click here to download a copy of the cypher wheels (12k PDF). Copy and cut out the two wheels. Place the smaller wheel on top of the larger wheel and rotate them so your "key letter" on the small wheel is beneath the "A" of the large wheel. Now you can encipher your plaintext and pass it to your friend who knows the proper key letter.

You could make your ciphertext a little tougher to decode if you threw 26 pieces of paper into a hat, each with a letter of the alphabet written on it, drew them out one at a time, and put them side-by-side under a normal alphabet. The result might look like this (I just used the order of the keys on my keyboard, so you might call this a "Qwerty" code):

Plaintext letter    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Ciphertext letter Q W E R T Y U I O P A S D F G H J K L Z X C V B N M

You can construct a secret message from the above table. Every time you see an "I" you would substitute the "O" beneath and so on for the other characters. The message "Meet me after school behind the gym," would read

"DTTZ DT QYZTK LEIGGS WTIOFR ZIT UND."

Word lengths - especially the short words - give great clues as to the nature of the code (see frequency charts). To help conceal your message, ignore the spaces and break the message up into equal-sized chunks. Five letters are customary in the spy biz, so your message comes out like this (Note that an extra "dummy" character "M" is added at the end to make it come out with a 5-letter group. Your recipient should have no trouble with the extra character.):

DTTZD TQYZT KLEIG GSWTI OFRZI TUNDM

Another popular system called a diagrammatic cipher, used by many children in school, substitutes symbols for letters instead of other letters. This system is, in essence, the same as the letter substitution system, but it's easier to remember than 26 randomly picked letters. It uses the tic-tac-toe boards and two X's as shown below.

The same secret message as above, using the line-shapes that surround each letter (and including a dot where needed) becomes:

Even though it looks like undecipherable outer-space alien text, this would take an arm-chair cryptologist only about 10 minutes or less to figure out. Why? Given enough ciphertext, certain patterns become obvious. Notice how often the empty four-sided box appears: six times out of a total of 29 characters or about 20% of the time. This would immediately indicate that the empty box was almost certainly the symbol for "E," the most frequently used letter in English. Other letters can also be determined by their frequency and by their association with other nearby characters (see "Frequencies"). Almost all substitution ciphers are open to this kind of analysis.

Francis Bacon created one of the more interesting substitution ciphers. He used two different type faces slightly differing in weight (boldness). He broke up his ciphertext into 5 character groups, each of which would represent one character in his plaintext. Depending on which characters of the group were bold, one could determine the plaintext character using the following table (* stands for a plain character and B for a bold character)


A=*****	G=**BB*	M=*BB**	S=B**B*	Y=BB***
B=****B	H=**BBB	N=*BB*B	T=B**BB	Z=BB**B
C=***B*	I=*B***	O=*BBB*	U=B*B**        
D=***BB	J=*B**B	P=*BBBB	V=B*B*B        
E=**B**	K=*B*B*	Q=B****	W=B*BB*        
F=**B*B	L=*B*BB	R=B***B	X=B*BBB        

Our same secret message as above would appear thusly (Bacon's bold and plain characters were less obvious than those below):

         To be or not to be that is the question.
         Whether 'tis nobler in the mind to
         suffer the slings and arrows of
         outrageous fortune or to take arms
         against a sea of troubles and by
         opposing end them?

To decipher, we just break the characters into groups of 5 and use the key above to find the plaintext message.

  M     E     E     T     M     E     B     E  
Tobeo rnott obeth atist heque stion Wheth ertis

  H     I     N     D     T     H     E     G  
noble rinth emind tosuf ferth eslin gsand arrow

  Y     M     A     F     T     E     R     S   
sofou trage ousfo rtune ortot akear msaga insta 

  C     H     O     O     L                    
seaof troub lesan dbyop posin gendt hem?       


Transposition ciphers

Going back to your school days, oo-day oo-yay emember-ray ig-pay atin-lay? Pig-latin is a form of transposition cipher where the original letters are kept intact (albeit with the addition of the suffix "ay"), but rearranged in some way.

Going back way before your school days, to the 5th century B.C., the Spartans used an interesting transposition cipher called a scytale. The scytale utilized a cylinder with a ribbon wrapped helically around it from one end to the other. The message was written across the ribbons, and then unwrapped from the cylinder. Only someone with an identical diameter cylinder could re-wrap and read the message.

The scytale depended on a piece of hardware, the cylinder, which if captured by the enemy, compromised the whole system. Also, the receiver could lose or break the cylinder and therefore lose the ability to decipher any message. It would be better if the method were completely "intellectual" and could be remembered and used without resorting to a physical device.

Since both the sender and receiver of a transposed ciphertext must agree on and remember this algorithm or method for enciphering and deciphering, something easy would be nice. Since geometrical figures are easy to remember, they serve as the basis for a whole class of transposition ciphers. Let's put our message into the shape of a box. Since there are 29 characters, we'll add a dummy ("O") to make 30 and write the message in a six by five box.

M E E T M E
A F T E R S
C H O O L B
E H I N D T
H E G Y M O

We can now transcribe the message by moving down the columns instead of across the rows. Once again we'll break the characters into groups of five to give no clues about word sizes. The result looks like this :

MACEH EFHHE ETOIG TEONY MRLDM ESBTO

The real variety begins when you realize that you don't have to write your plaintext into the box row by row. Instead, you can follow a pattern that zig-zags horizontally, vertically or diagonally, or one that spirals in or spirals out (clockwise or counterclockwise), or many other variations (see diagram below).

Once you've put the text in the chosen form using one route, you can then encipher it by choosing a different route through the text. You and your partner just have to agree on the reading route, the transcription (enciphering) route, and the starting point to have yourselves a system. These systems are called route transcriptions.

Here's our message again. The reading route spirals counterclockwise inward, starting at the lower right corner (left diagram). The transcription route (right diagram) is zig-zag diagonal starting at the lower left corner. The ciphertext becomes:

EAMTN FTDIE EHOTE RHMEM BYESC GLOHO

To decipher, you fill the in box following the zig-zag route and read the message using the spiral route.

Another type of transposition cipher uses a key word or phrase to mix up the columns. This is called columnar transposition. It works like this: First, think of a secret key word. Ours will be the word SECRET. Next, write it above the columns of letters in the square, and number the letters of the key word as they would fall if we placed them in alphabetical order. (If there are duplicate letters, like the "E", they are numbered from left to right.)

5 2 1 4 3 6 S E C R E T M E E T M E A F T E R S C H O O L B E H I N D T H E G Y M O

Now write the columns down in the order indicated by the numbers. The resulting ciphertext looking like this:

ETOIG EFHHE MRLDM TEONY MACEH ESBTO

As you can see, this is just a different arrangement of the previous ciphertext, but at least it isn't in some regular pattern. We could have easily made it a little more difficult by filling the square following a more complicated path. We could also use a geometric shape other than a rectangle and combine substitution and transposition. The only problem that might occur is that the deciphering may become so complicated that it will remain a secret at the receiving end forever! Come to think of it, she never did meet me behind the gym...


Frequencies

Order of frequency of single letters:
E T O A N I R S H D L C W U M F YG P B V K X Q J Z

Order of frequency of digraphs (two letter combinations):
th er on an re he in ed nd ha at en es of or nt ea ti to it st io le is ou ar as de rt ve

Order of frequency of trigraphs:
the and tha ent ion tio for nde has nce edt tis oft sth men

Order of frequency of most common doubles:
ss ee tt ff 11 mm oo

Order of frequency of initial letters:
T O A W B C D S F M R H I Y E G L N P U J K

Order of frequency of final letters:
E S T D N R Y F L O G H A R M P U W

One-letter words:
a, I, 0.

Most frequent two-letter words:
of, to, in, it, is, be, as, at, so, we, he, by, or, on, do, if, me, my, up, an, go, no, us, am...

Most frequent three-letter words:
the, and, for, are, but, not, you, all, any, can, had, her, was, one, our, out, day, get, has, him, his, how, man, new, now, old, see, two, way, who, boy, did, its, let, put, say, she, too, use...

Most frequent four-letter words:
that, with, have, this, will, your, from, they, know, want, been, good, much, some, time, very, when, come, here, just, like, long, make, many, more, only, over, such, take, than, them, well, were...


Bibliography:

Gardner, Martin. Codes, Ciphers and Secret Writing.
New York, NY: Dover Publications Inc., 1972.
A wonderful, fun, and easy to read introduction to codes and ciphers.

Smith, Laurence Dwight. Cryptography, the Science of Secret Writing.
New York, NY: Dover Publications Inc., 1943.
A good account of codes and ciphers with many historical examples.

Konheim, Alan G. Cryptography: A Primer.
New York, NY: John Wiley & Sons, 1981.
A highly technical (and mathematical) book on more modern methods of code making and breaking.

Gaines, Helen Fouché. Cryptanalysis: A Study of Ciphers and their Solution.
New York, NY: Dover Publications Inc., 1956.
The title says it all.


Internet Resources


© 1995, Ron Hipschman