[JoGu]

Cryptology

Running-Text Ciphers – Cryptanalysis According to FRIEDMAN

a7Hzq .#5r<
kÜ\as TâÆK$
ûj(Ö2 ñw%h:
Úk{4R f~`z8
¤˜Æ+Ô „&¢Dø

FRIEDMAN proposed a systematic approach to solving running-key ciphers in the article

W. F. Friedman: Methods for the Solution of Running-Key Ciphers.
Riverbank Publication No. 16 (1918).
In: The Riverbank Publications Vol 1, Aegean Park Press 1979.


Consider a running-text cryptogram. FRIEDMAN's method starts from the observation that a significant fraction of the ciphertext letters arise from a combination of two frequent plaintext letters.

The frequency distribution of the nine most frequent German letters is:

ENI RSA TDU
18.0 % 10.6 % 8.1 % 7.2 % 6.9 % 6.0 % 5.7 % 5.4 % 4.6 %
Therefore these letters account for 72.5 % of a German text.

Assuming that the key is sufficiently independent of the plaintext we expect that about 53 % ciphertext letters arose from a combination of two of these letters in plaintext or key. This fact is not overly impressive. In the example

                                  |   | | |     |   |
  Plaintext:  i c h k o m m e m o r g e n u m z e h n
  Key:        V O M E I S E B E F R E I T S I N D S T
              ---------------------------------------
  Ciphertext: D Q T O W E Q F Q T I K M G M U M H Z G
this applies only to 6 of 20 letters. The method won't work well for this example.


Let us take another example (from the football world championships 2002):

              | | | | |       | |         | |   |       |   | |  
  Plaintext:  d e u t s c h l a n d b e s i e g t p a r a g u a y
  Key:        E I N E N A T U E R L I C H E S P R A C H E H A T T
              ---------------------------------------------------
  Ciphertext: H M H X F C A F E E O J G Z M W V L P C Y E N U T R
Here we see 13 of 26 letters as interesting. We use this example to explain the method.

Let's begin with the first four letters, and consider all combinations that lead to them

  Plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  Key:        H G F E D C B A Z Y X W V U T S R Q P O N M L K J I
                    | |                 |             |          
  Ciphertext: H H H H H H H H H H H H H H H H H H H H H H H H H H

  Plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  Key:        M L K J I H G F E D C B A Z Y X W V U T S R Q P O N
                      |       |                   | | |          
  Ciphertext: M M M M M M M M M M M M M M M M M M M M M M M M M M

  Plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  Key:        H G F E D C B A Z Y X W V U T S R Q P O N M L K J I
                    | |                 |             |          
  Ciphertext: H H H H H H H H H H H H H H H H H H H H H H H H H H

  Plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  Key:        X W V U T S R Q P O N M L K J I H G F E D C B A Z Y
                    | |                             | |          
  Ciphertext: X X X X X X X X X X X X X X X X X X X X X X X X X X
The most probable pairs are flagged. We condense this observation:
  DENU EISTU DENU DETU
  EDUN IEUTS EDUN UTED
  H    M     H    X
There is a total of 4·5·4·4 = 320 possible combinations of these pairs. Some of them may be eliminated immediately, for example we may exclude that plaintext or key begin with the letters DS.

If we start with the pair DE we might continue with EI or US. The first case has only one meaningful continuation:

  DEUT
  EINE
The second case could proceed with DE, but no fourth pair fits. A possible pair number 3 is NU also, and then ET or TE as pair number 4. Therefore we note two more options, both of them not really convincing:
  DEUT  DUNE  DUNT
  EINE  ESUT  ESUE

Starting with ED we find an exactly symmetric situation and get the same three options but with plaintext and key interchanged.

Starting with NU we might continue with IE or US. The first case has ED as only plausible continuation, and then TE:

  DEUT  DUNE  DUNT  NIET
  EINE  ESUT  ESUE  UEDE
The second case could proceed with DE (and then ET) or NU (and then there is no good continuation). So we found one more option:
  DEUT  DUNE  DUNT  NIET  NUDE
  EINE  ESUT  ESUE  UEDE  USET
Taking all the symmetric ones into account we face a total of 10 somewhat plausible options—under the assumption that the first four letters of plaintext and key belong to the nine most frequent German letters.

Of our five (+ five symmetric) options the first looks best. But also the fourth is reasonably good, bearing in mind that the keytext might begin in the middle of a word (for example »müde« = (M)UEDE). In any case let's begin with the first option that looks somewhat better. It suggests the continuation SCH. This seems promising:

  DEUTSCH
  EINENAT
Of course if this fails we would also try for example DEUTLICH or DEUTEN.

As next letter in the first row we would try E or L and note that L gives a better continuation in the second row (U better than B). Therefore the begin DEUTSCHLAND is decrypted—but we don't yet know whether it is plaintext or key. From this point we struggle ahead in zigzag as noted before.


Author: Klaus Pommerening, 2002-Jun-16; last change: 2014-Aug-06.