Cryptology

Breaking the Bazeries Cylinder

[Presentation following F. L. Bauer]

Background

Étienne BAZERIES (1846–1931) was a zealous and successful cryptanalyst. He broke every system that the French Army used or that was proposed for use, including a cipher device by Gaëtan Henri Léon de VIARIS (1847–1901).

As a theoretician BAZERIES was less gifted. In 1893 he constructed a cylinder device, which de VIARIS promptly broke, taking satisfaction for his earlier defeat. Here is how he broke the cipher.


Observation

Each plaintext letter has only a restricted set of possible ciphertext letters. For example, in the 1st generatrix the disks 1 to 20 transform the »d« to:

    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
 1  E  F  F  C  C  C  P  T  E  I  B  G  N  F  F  K  F  I  F  G
Therefore the set of possible ciphertext letters is {B,C,E,F,G,I,K,N,P,T}.
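
The step from the table row to the set can be sketched in a few lines of Python. The row is copied from the table above; the helper name `char_class` is our own choice, and it writes the set in the bracket notation that serves as a search pattern later on.

```python
# Ciphertext letters for plaintext »d« in the 1st generatrix, disks 1 to 20
# (copied from the table above).
ROW_D_GEN1 = "E F F C C C P T E I B G N F F K F I F G".split()

def char_class(letters):
    """Return the distinct letters, sorted, as a bracket expression."""
    return "[" + "".join(sorted(set(letters))) + "]"

print(char_class(ROW_D_GEN1))  # [BCEFGIKNPT]
```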

Analogously with the 2nd, 3rd, 4th, ... generatrix the »d« becomes

    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
 2  F  G  G  B  B  B  R  O  T  E  F  H  S  G  G  P  G  U  G  H
 3  G  H  H  A  Y  Y  I  Y  S  U  G  J  A  J  H  Q  H  A  I  I
 4  H  J  I  Z  U  U  J  M  L  P  H  K  U  K  K  S  J  B  J  J
 5  ...


A Cryptogram

FSAMC RDNFE YHLOE RTXVZ
LRMQU UXRGZ NBOML NDNPV
RTMUK HRDOX LAXOD CREEH
VREXZ GUGLA BSEST VFNGH

The text is too short for a frequency analysis of the columns. The coincidence test fails as well [as expected when the encryption uses different generatrices for different rows].

The cryptanalysis assumes that the device and all its disks are known. Only the order of the disks is secret. (A device that is supposed to be fit for military use must remain secure even under this assumption!)


Approach: Probable Word

The military context suggests »division« as a probable word. (We don't know whether de VIARIS first tried some other words without success.) If the word is encrypted using the 1st generatrix, then the following pattern must occur in the ciphertext (where each bracket lists the possible ciphertext equivalents of the plaintext letter above it, joined by a logical OR):
d           i            v         i
[BCEFGIKNPT][BCEHJMNOQRU][BDILTUXY][BCEHJMNOQRU]
s          i            o           n     
[ACELORTUV][BCEHJMNOQRU][GIJMNPRTUY][ACDEFHMOPQST]
This makes the negative pattern search considerably sharper: instead of 24 (*) variants at each position we have at most 12, and therefore encounter fewer »false alarms«.
(*) The alphabet has 25 letters (no W).
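
A rough estimate shows how much the restricted sets help. The sketch below assumes, purely for illustration, that ciphertext letters are uniformly distributed and independent; this is our assumption, not part of the original analysis. It compares the chance that a random 8-letter window fits the brackets with the chance that it passes the plain negative search. The brackets are copied from the pattern above.

```python
from math import prod

ALPHABET_SIZE = 25  # no W

# Brackets for d-i-v-i-s-i-o-n in the 1st generatrix, copied from above.
CLASSES = ["BCEFGIKNPT", "BCEHJMNOQRU", "BDILTUXY", "BCEHJMNOQRU",
           "ACELORTUV", "BCEHJMNOQRU", "GIJMNPRTUY", "ACDEFHMOPQST"]

def match_probability(sizes, n=ALPHABET_SIZE):
    """Probability that independent random letters fall into sets of these sizes."""
    return prod(size / n for size in sizes)

p_restricted = match_probability(len(c) for c in CLASSES)
p_negative = match_probability([ALPHABET_SIZE - 1] * len(CLASSES))

print(f"restricted sets: {p_restricted:.2e}")
print(f"negative search: {p_negative:.2f}")
```

Under this simplified model the restricted pattern is met by chance far less often than the negative pattern, which is why few windows survive the search.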

We automate this search with a small Perl program that uses the eight brackets together as its search term: each position must show exactly one of the letters in the corresponding bracket. For the 1st generatrix we get no hit, and the search also fails for the 2nd and 3rd generatrix. For the 4th one we search for the pattern:

d            i                 v              i
[BHIJKLMPSUZ][BCEFLMNOPQRSTUVX][ADEFHJKLNOQRY][BCEFLMNOPQRSTUVX]
s              i                  o            n     
[ABDEIMNOQRUXZ][BCEFLMNOPQRSTUVX][ACEFKMNRSXVZ][ADEGHJQRSTUVXY]
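
The search itself needs little more than a regular expression. The following is a minimal Python sketch, not the Perl program mentioned above; the names are our own, and the cryptogram and the brackets are copied from this page. It lists every position where the eight brackets fit, overlaps included, so each candidate it prints still has to be checked against the actual disks.

```python
import re

CRYPTOGRAM = ("FSAMC RDNFE YHLOE RTXVZ "
              "LRMQU UXRGZ NBOML NDNPV "
              "RTMUK HRDOX LAXOD CREEH "
              "VREXZ GUGLA BSEST VFNGH").replace(" ", "")

# Brackets for d-i-v-i-s-i-o-n in the 4th generatrix, copied from above.
PATTERN = ("[BHIJKLMPSUZ][BCEFLMNOPQRSTUVX][ADEFHJKLNOQRY][BCEFLMNOPQRSTUVX]"
           "[ABDEIMNOQRUXZ][BCEFLMNOPQRSTUVX][ACEFKMNRSXVZ][ADEGHJQRSTUVXY]")

def find_candidates(pattern, text):
    """Return (1-based position, matched string) for every fit."""
    # The lookahead lets matches overlap, so no candidate is skipped.
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({pattern}))", text)]

for position, hit in find_candidates(PATTERN, CRYPTOGRAM):
    print(position, hit)
```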

The Perl search for this pattern yields the hit HLOERTXV beginning at position 12. Because disk 3 is the only disk that transforms i to L in the 4th generatrix, it must be at position 13 of the cylinder. In the same way we find the unique transformations v→O (disk 5), i→E (4), s→R (11), i→T (13), o→X (15), and n→V (12). From this we derive part of the disk order:

Position  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Disk no.  *  *  *  *  *  *  *  *  *  *  *  *  3  5  4 11 13 15 12  *

For the d at position 12 either disk 1 or disk 11 would fit. But disk 11 is already in use! Therefore disk 1 must be at position 12. This gives:

Position    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Disk no.    *  *  *  *  *  *  *  *  *  *  *  1  3  5  4 11 13 15 12  *
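
The choice for position 12 can be checked directly against the 4th row of the »d« table in the Observation section. In this small sketch the function name `candidate_disks` is our own; the row and the disks already placed are copied from the text.

```python
# Ciphertext letters for plaintext »d« in the 4th generatrix, disks 1 to 20
# (row 4 of the table in the Observation section).
ROW_D_GEN4 = "H J I Z U U J M L P H K U K K S J B J J".split()

# Disks already placed at positions 13 to 19.
USED = {3, 5, 4, 11, 13, 15, 12}

def candidate_disks(row, cipher_letter, used):
    """Disks (numbered from 1) that yield cipher_letter and are still free."""
    return [disk for disk, letter in enumerate(row, start=1)
            if letter == cipher_letter and disk not in used]

print(candidate_disks(ROW_D_GEN4, "H", set()))  # [1, 11]
print(candidate_disks(ROW_D_GEN4, "H", USED))   # [1]
```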


Continuing

Did we find a true hit? Then we should see meaningful pieces of text at distances of 20, 40, and 60 letters. But remember that each block of 20 letters might be encrypted using a different generatrix.

Indeed the second block at positions 12 to 19 reads

                                   B  O  M  L  N  D  N  P
and in the row immediately above this (at left in our table) we find the letters
                                   a  i  n  m  a  t  i  n
a string that looks very French and suggests that the second block is encrypted using the first generatrix.

In the same way for the third block and the 22nd generatrix we get

                                   A  X  O  D  C  R  E  E
                                   d  e  p  a  r  t  a  s
—this looks good—and for block 4 and generatrix 3:
                                   S  E  S  T  V  F  N  G
                                   p  x  x  x  x  x  x  x

This looks odd at first sight. But it does not look random; in fact it seems highly probable that the message is padded with »x« at the end.

This leads to the assumption that the last letter of the plaintext is also x. Thus we search the remaining disks for those that transform x to the last ciphertext letter H in the 3rd generatrix.

We find disks 9 and 14,

             9 C H A R Y B D E T S L F G I J K M N O P Q U V X Z
            14 H O N E U R T P A I B C D F G J K L M Q S V X Y Z
Disk 9 would yield a y in the 3rd block (unlikely). Disk 14 results in e.
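
With the two disk alphabets printed above, this comparison can be reproduced. The sketch assumes that the g-th generatrix is read g steps further along the disk alphabet, cyclically; this is our reading of the tables, and it agrees with the »d« rows in the Observation section. The function names are our own.

```python
DISK_9 = "CHARYBDETSLFGIJKMNOPQUVXZ"
DISK_14 = "HONEURTPAIBCDFGJKLMQSVXYZ"

def encrypt(disk, letter, generatrix):
    """Read the letter `generatrix` steps after the plaintext letter."""
    return disk[(disk.index(letter) + generatrix) % len(disk)]

def decrypt(disk, letter, generatrix):
    """Step back from the ciphertext letter to the plaintext letter."""
    return disk[(disk.index(letter) - generatrix) % len(disk)]

# Both disks turn x into H in the 3rd generatrix (last letter of block 4).
print(encrypt(DISK_9, "X", 3), encrypt(DISK_14, "X", 3))    # H H

# The last letter of block 3 is H, encrypted with the 22nd generatrix.
print(decrypt(DISK_9, "H", 22), decrypt(DISK_14, "H", 22))  # Y E
```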

The original analysis gave a much nicer reason: Plausible continuations of the third line would be »départ à six heures« or »départ à sept heures«. The first of these would mean a terribly early time for the French Army. This argument shows the importance of context knowledge.

The second continuation fits perfectly: ptheures at positions 1 to 8 matches VREXZGUG in the 3rd generatrix, the candidate disks for these positions being 16, 7, 18, 6, 9 or 17, 10, 2 or 8, and 9. Since position 8 needs disk 9, disk 17 must be at position 5. Altogether we get:

Position    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Disk no.   16  7 18  6 17 10  *  9  *  *  *  1  3  5  4 11 13 15 12 14
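
The elimination used here is simple: a disk that is forced at one position drops out everywhere else. The sketch below is our own illustration; the candidate lists for positions 1 to 8 are read from the paragraph above, with disk 16 at position 1 as in the table.

```python
def eliminate(candidates):
    """Remove disks that are forced at one position from all other lists."""
    candidates = [set(c) for c in candidates]
    changed = True
    while changed:
        changed = False
        forced = {next(iter(c)) for c in candidates if len(c) == 1}
        for c in candidates:
            if len(c) > 1 and c & forced:
                c -= forced
                changed = True
    return candidates

# Candidate disks for positions 1 to 8; position 1 is taken as disk 16,
# in agreement with the table above.
CANDIDATES = [{16}, {7}, {18}, {6}, {9, 17}, {10}, {2, 8}, {9}]

for position, disks in enumerate(eliminate(CANDIDATES), start=1):
    print(position, sorted(disks))
```

Position 5 is reduced to disk 17, while position 7 stays open between disks 2 and 8, matching the asterisk in the table.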


Finish

Now we have placed all but four of the disks. The partial plaintext shows that we are on the right track:

  l  a  t  r  o  i  *  i  *  *  *  d  i  v  i  s  i  o  n  s
  e  p  o  r  t  e  *  a  *  *  *  a  i  n  m  a  t  i  n  s
  u  r  r  e  i  m  *  s  *  *  *  d  e  p  a  r  t  a  s  e
  p  t  h  e  u  r  *  s  *  *  *  p  x  x  x  x  x  x  x  x

We easily guess the missing letters and get the disk order

Position    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Disk no.   16  7 18  6 17 10  8  9 20 19  2  1  3  5  4 11 13 15 12 14
and the plaintext
La troisième division se portera demain matin sur Reims STOP
Départ à sept heures STOP xxxxxxxx
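
As a consistency check, the final plaintext can be cut into blocks of 20 letters and compared with the partial rows above, and the disk order can be tested for being a permutation of 1 to 20. This is a sketch with helper names of our own; it checks only the bookkeeping, not the decryption itself, since the full disk alphabets are not reproduced on this page.

```python
import unicodedata

PLAINTEXT = ("La troisième division se portera demain matin sur Reims STOP "
             "Départ à sept heures STOP xxxxxxxx")

# Partial plaintext rows from the Finish section, '*' marking unknown letters.
PARTIAL = ["latroi*i***divisions",
           "eporte*a***ainmatins",
           "urreim*s***departase",
           "ptheur*s***pxxxxxxxx"]

DISK_ORDER = [16, 7, 18, 6, 17, 10, 8, 9, 20, 19,
              2, 1, 3, 5, 4, 11, 13, 15, 12, 14]

def to_blocks(text, size=20):
    """Strip accents, spaces and case, then cut into blocks of `size` letters."""
    decomposed = unicodedata.normalize("NFD", text)
    letters = "".join(ch for ch in decomposed
                      if ch.isascii() and ch.isalpha()).lower()
    return [letters[i:i + size] for i in range(0, len(letters), size)]

def fits(block, partial):
    """True if block agrees with partial wherever partial is not '*'."""
    return len(block) == len(partial) and all(
        p in ("*", b) for b, p in zip(block, partial))

blocks = to_blocks(PLAINTEXT)
print(blocks)
print(all(fits(b, p) for b, p in zip(blocks, PARTIAL)))
print(sorted(DISK_ORDER) == list(range(1, 21)))
```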


Author: Klaus Pommerening, 1997-Jul-11; last change: 2014-Jun-11.