CIS 428 Introduction to Cryptography

Barbara Nostrand, Ph.D.

Electrical Engineering and Computer Science

Laboratory 2 - Static Ciphers

The plugboard,  keyboard,  lamps,  and finger-wheels of the rotors emerging from the inner lid of a three-rotor German military Enigma machine.  Enigma implemented a family of poly-alphabitic subsitution ciphers. 


The main classical cipher types are transposition ciphers,  which rearrange the order of letters in a message (e.g. 'help me' becomes 'pleh em' in a trivially simple rearrangement scheme),  and substitution ciphers,  which systematically replace letters or groups of letters with other letters or groups of letters (e.g., 'fly at once' becomes 'gmz bu podf' by replacing each letter with the one following it in the alphabet).  Simple versions of either offered little confidentiality from enterprising opponents,  and still don't.  An early substitution cipher was the Caesar cipher,  in which each letter in the plaintext was replaced by a letter some fixed number of positions further down the alphabet.  It was named after Julius Caesar who is reported to have used it,  with a shift of 3,  to communicate with his generals during his military campaigns,  just like EXCESS-3 code in boolean algebra. 

The Enigma machine,  used in several variants by the German military between the late 1920s and the end of World War II,  implemented a complex electro-mechanical polyalphabetic cipher to protect sensitive communications.  Breaking the Enigma cipher at the Biuro Szyfrów,  and the subsequent large-scale decryption of Enigma traffic at Bletchley Park,  was an important factor contributing to the Allied victory in WWII.  (David Kahn,  The Codebreakers,  1967)

In this laboratory assignment,  we will explore mono-alphabitic subsitution ciphers.  Other ciphers include transposition ciphers which rearrange the characters in amessage.  A simple mono-alphabetic substitution cipher might encode "help me" as "ifmq nf" while a transposition cipher might encode the same message as "pleh em".  Other approaches to secure communication include steganogrpahy or "information hiding".  This includes such well-known devices as invisible ink.  The cipher grille was invented in middle ages to hide text inside of letters. 


In our previous assignment,  we looked a simple rotation cipher which uniformly shifts character codes.  This technique is very weak as once you have decoded a single letter,  you can reliably decrypt the entire message.  An obvious alternative is to construct a random discrete function from the plain-text alphabet to the cipher-text alphabet.  Any cipher which relies upon a single function of this type is called a mono-alphabetic substitution cipher

Various physical devices and aids have been used to assist with ciphers.  One of the earliest may have been the scytale of ancient Greece,  a rod supposedly used by the Spartans as an aid for a transposition cipher.  With the invention of polyalphabetic ciphers came more sophisticated aids such as Alberti's own cipher disk,  Johannes Trithemius' tabula recta scheme,  and Thomas Jefferson's multi-cylinder (reinvented independently by Bazeries around 1900).  Several mechanical encryption/decryption devices were invented early in the 20th century,  and many patented,  including rotor machines -- most famously the Enigma machine used by Germany in World War II.  The ciphers implemented by better quality examples of these designs brought about a substantial increase in cryptanalytic difficulty after WWI.  (James Gannon, Stealing Secrets,  Telling Lies:  How Spies and Codebreakers Helped Shape the Twentieth Century. )

Software Engineering

The first part of your assignment will be to construct an encryption/decription system which: 


Ciphertexts produced by classical ciphers (and some modern ones) always reveal statistical information about the plaintext,  which can often be used to break them.  After the discovery of frequency analysis (perhaps by the Arab polymath al-Kindi) about the 9th century,  nearly all such ciphers became more or less readily breakable by an informed attacker.  Such classical ciphers still enjoy popularity today,  though mostly as puzzles (see cryptogram).  Essentially all ciphers remained vulnerable to cryptanalysis using this technique until the invention of the polyalphabetic cipher,  most clearly by Leon Battista Alberti around the year 1467 (though there is some indication of earlier Arab knowledge of them).  Alberti's innovation was to use different ciphers (i.e., substitution alphabets) for various parts of a message (perhaps for each successive plaintext letter in the limit).  He also invented what was probably the first automatic cipher device,  a wheel which implemented a partial realization of his invention.  In the polyalphabetic Vigenère cipher,  encryption uses a key word,  which controls letter substitution depending on which letter of the key word is used.  In the mid 1800s Babbage showed that polyalphabetic ciphers of this type remained partially vulnerable to frequency analysis techniques.  (David Kahn,  The Codebreakers,  1967)

Although frequency analysis is a powerful and general technique,  encryption was still often effective in practice;  many a would-be cryptanalyst was unaware of the technique.  Breaking a message without frequency analysis essentially required knowledge of the cipher used,  thus encouraging espionage,  bribery,  burglary,  defection,  etc.  to discover it.  It was finally explicitly recognized in the 19th century that secrecy of a cipher's algorithm is not a sensible or practical safeguard;  in fact,  it was further realized any adequate cryptographic scheme (including ciphers) should remain secure even if the adversary fully understands the cipher algorithm itself.  Secrecy of the key should alone be sufficient for a good ciphers to maintain confidentiality under attack.  This fundamental principle was first explicitly stated in 1883 by Auguste Kerckhoffs and is generally called Kerckhoffs' principle;  alternatively and more bluntly,  it was restated by Claude Shannon as Shannon's Maxim - 'the enemy knows the system'. 

Software Engineering

Your task is to construct an automatic statistical analyzer which: 

Letter Frequencies in English

Letter frequency analysis is the key to analyzing mono-alphabetic substitution ciphers.  This technique depends upon the plain-text being composed using typical word frequencies.  This attack can be partially overcome by designing plain-text which avoids common letters such as E or T or which increases the frequency of uncommon letters such a Q or Z.  This becomes increasingly difficult as the length of the plain-text increases.  The user may also choose to use non-standard spelling or insert junk characters.  These techniques are not part of the current assignment. 

Relative Frequencies of Letters in General English Plain Text
From Cryptographical Mathematics, by Robert Edward Lewand

Alphabetical Order

Letter Frequency Letter Frequency
a 0.08167 n 0.06749
b 0.01492 o 0.07507
c 0.02782 p 0.01929
d 0.04253 q 0.00095
e 0.12702 r 0.05987
f 0.02228 s 0.06327
g 0.02015 t 0.09056
h 0.06094 u 0.02758
i 0.06966 v 0.00978
j 0.00153 w 0.02360
k 0.00772 x 0.00150
l 0.04025 y 0.01974
m 0.02406 z 0.00074

Frequency Order

Letter Frequency Letter Frequency
e 0.12702 m 0.02406
t 0.09056 w 0.02360
a 0.08167 f 0.02228
o 0.07507 g 0.02015
i 0.06966 y 0.01974
n 0.06749 p 0.01929
s 0.06327 b 0.01492
h 0.06094 v 0.00978
r 0.05987 k 0.00772
d 0.04253 j 0.00153
l 0.04025 x 0.00150
c 0.02782 q 0.00095
u 0.02758 z 0.00074

Last modified: 2008 JAN 20