Cryptology

Cryptanalysis of an Autokey Cipher

The Cryptogram

Suppose we got the ciphertext

   LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI
   WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH
   QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR

The context suggests that the plaintext language is French.

Here are some statistics. The letter count

     A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
     8  1  3  1  9  6  1  4 10  1  4 11  4  3  3  5  7  6  5 10  4  5  3  9  6  9

as well as the coincidence index 0.0437 suggest a polyalphabetic cipher. The autocoincidence spectrum shows no meaningful period. The frequency distribution of the single letters hints at a running-key or autokey cipher that uses the standard alphabet (= TRITHEMIUS table).
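
The two statistics can be recomputed directly from the ciphertext. The following Python lines are a minimal sketch, not the tool used for this page; the names `letter_counts` and `coincidence_index` are chosen here for illustration. The coincidence index is taken as the sum of n(n-1) over all letter counts n, divided by N(N-1) for the text length N; with this formula the sketch reproduces the value 0.0437.

```python
from collections import Counter

CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")


def letter_counts(text):
    """Count how often each letter occurs in the text."""
    return Counter(text)


def coincidence_index(text):
    """Probability that two letters drawn at different positions coincide."""
    n = len(text)
    counts = Counter(text)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


counts = letter_counts(CIPHERTEXT)
ic = coincidence_index(CIPHERTEXT)
print(len(CIPHERTEXT), round(ic, 4))
for letter in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
    print(letter, counts[letter])
```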


A Probable Word

Since the message probably originated from the French embassy in Berlin in 1870, we may assume that the plaintext contains the word »allemand«. Moving this word along the ciphertext with the help of the Perl script probwd.pl, we get 4 good matches (plus some weak ones):

Position  Substring  MFL rate  BLW score

    000   LJHEHFFX      0,25      4,1
    001   UHXPTSNQ      0,50      4,3
    002   SXIBGAGJ      0,38      4,2
    003   IIUOOTZQ  --> 0,75      3,0
    004   TUHWHMGW      0,25      1,4
    005   FHPPATMG      0,25      5,7
    006   SPIIHZWF      0,38      2,9
    007   AIBPNJVW      0,38      2,9
    008   TBIVXIMP      0,38      5,8
    009   MIOFWZFV      0,25      4,0
    010   TOYENSLA  ==> 0,88 --> 10,5 <===
    011   ZYXVGYQW      0,00      0,3
    012   JXOOMDMJ      0,25      1,9
    013   IOHURZZM      0,50      4,3
    014   ZHNZNMCJ      0,25      0,3
    015   SNSVAPZC      0,50      6,6
    016   YSOIDMSF      0,50      6,2
    017   DOBLAFVW      0,38      7,0
    018   ZBEITIMO      0,62      8,7
    019   MEBBWZEB      0,25      4,5
    020   PBUENRRT  --> 0,75      8,4
    021   MUXVFEJI      0,38      4,4
    022   FXONSWYO      0,50      4,2
    023   IOGAKLEW      0,62      6,0
    024   ZGTSZRMB      0,38      4,5
    025   RTLHFZRH      0,50      3,9
    026   ELANNEXI  ==> 0,88 --> 11,9 <===
    027   WAGVSKYP      0,25      1,5
    028   LGOAYLFO      0,62      3,2
    029   ROTGZSEN  --> 0,75      7,9
    030   ZTZHGRDU      0,38      4,3
    031   EZAOFQKZ      0,38      1,7
    032   KAHNEXPX      0,38      3,3
    033   LHGMLCNQ      0,38      2,3
    034   SGFTQAGC      0,38      3,4
    035   RFMYOTSB      0,50      4,2
    036   QMRWHFRK      0,25      1,0
    037   XRPPTEAB      0,50      8,0
    038   CPIBSNRV      0,50      4,7
    039   AIUABELY  --> 0,75      8,5
    040   TUTJSYOS  --> 0,75      5,3
    041   FTCAMBIL      0,50      8,1
    042   ECTUPVBF      0,38      6,1
    043   NTNXJOVT      0,62      4,7
    044   ENQRCIJX      0,50      5,8
    045   YQKKWWNE      0,25      2,4
    046   BKDEKAUF      0,38      5,2
    047   VDXSOHVB      0,25      2,5
    048   OXLWVIRI      0,62      5,7
    049   ILPDWEYI      0,50      2,5
    050   WPWESLYU      0,50      5,1
    051   AWXAZLKC      0,38      0,0
    052   HXTHZXSH      0,25      2,8
    053   ITAHLFXS      0,62      5,5
    054   EAATTKIU  ==> 0,88      5,6
    055   LAMBYVKL      0,38      4,8
    056   LMUGJXBH      0,25      2,3
    057   XUZRLOXW      0,50      3,5
    058   FZKTCKML      0,25      1,0
    059   KKMKYZBS      0,12      0,3
    060   VMDGNOIN      0,50      6,3
    061   XDZVCVDF      0,00      0,6
    062   OZOKJQVM      0,25      0,3
    063   KODREICQ      0,50      6,2
    064   ZDKMWPGX      0,00      0,0
    065   OKFEDTNR      0,62      5,2
    066   VFXLHAHK      0,25      2,0
    067   QXEPOUAU      0,62      9,4
    068   IEIWINKX      0,62      5,1
    069   PIPQBXNO      0,38      3,2
    070   TPJJLAEW      0,50      3,9
    071   AJCTORMZ      0,50      6,4
    072   UCMWFZPU      0,25      2,7
    073   NMPNNCKF      0,38      5,1
    074   XPGVQXVW      0,00      0,3
    075   AGOYLIMV      0,50      5,1
    076   RORTWZLE  --> 0,75      7,6
    077   ZRMENYUN      0,62      8,7
    078   CMXVMHDI      0,12      2,0
    079   XXOUVQYK      0,25      3,7
    080   IONDELAP  --> 0,75 ==> 14,0 <===
    081   ZNWMZNFV      0,25      1,1
    082   YWFHBSLJ      0,25      2,6
    083   HFAJGYZC      0,12      2,0
    084   QACOMMST      0,50      8,0
    085   LCHUAFJR      0,50      5,3
    086   NHNITWHB      0,50      3,5
    087   SNBBKURN      0,62      3,9
    088   YBUSIEDQ      0,50      7,9
    089   MULQSQGB      0,38      4,0
    090   FLJAETRI  --> 0,75      6,8
    091   WJTMHEYC      0,25      3,0
    092   UTFPSLSE  --> 0,75      7,7
    093   EFIAZFUN      0,62      5,7
    094   QITHTHDQ      0,38      3,7
    095   TTABVQGB      0,38      5,0
    096   EAUDETRI  ==> 0,88 ==> 12,6 <===
    097   LUWMHEYN      0,50      3,2
    098   FWFPSLDF      0,25      3,0
    099   HFIAZQVX      0,25      1,9
    100   QITHEINU  --> 0,75      8,6
    101   TTAMWAKU      0,62      5,0
    102   EAFEOXKS      0,62      5,1
    103   LFXWLXIW      0,38      1,1
    104   QXPTLVMM      0,25      3,7
    105   IPMTJZCV      0,25      1,4
    106   AMMRNPLQ      0,50      6,0
    107   XMKVDYGI      0,12      0,6
    108   XKOLMTYI      0,50      1,7
    109   VOEUHLYD      0,50      4,3
    110   ZENPZLTX      0,50      4,2
    111   PNIHZGNS      0,50      4,6
    112   YIAHUAIM      0,62      6,1
    113   TAACOVCX      0,50      6,5
    114   LAVWJPNO      0,50      5,3
    115   LVPRDAEQ      0,50      7,0
    116   GPKLORGH      0,38      3,3
    117   AKEWFTXI      0,50      0,8
    118   VEPNHKYF      0,25      3,9
    119   PPGPYLVM      0,12      0,9
    120   AGIGZICQ      0,38      4,2
    121   RIZHWPGU      0,38      2,8
    122   TZAEDTKU      0,62      2,7
    123   KAXLHXKZ      0,25      1,4
    124   LXEPLXPF      0,38      4,1
    125   IEITLCVE  --> 0,75      9,5
    126   PIMTQIUV      0,50      4,5
    127   TMMYWHLB      0,25      2,8
    128   XMREVYRR      0,50      5,1
    129   XRXDMEHN      0,38      4,0

The tests for language recognition were used to weight the matches.
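
The substring column of the table comes from subtracting the probable word from the ciphertext, letter by letter modulo 26, at every position. The following Python lines are a rough sketch of that step only. They are not the script probwd.pl, the names `subtract` and `slide_word` are made up for this illustration, and the MFL rate and BLW score are not computed.

```python
CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")


def subtract(cipher_part, word):
    """Subtract word from cipher_part letter by letter (A=0, ..., Z=25)."""
    return "".join(
        chr((ord(c) - ord(w)) % 26 + ord("A"))
        for c, w in zip(cipher_part, word)
    )


def slide_word(cipher, word):
    """Return (position, substring) for every placement of word."""
    n = len(word)
    return [
        (pos, subtract(cipher[pos:pos + n], word))
        for pos in range(len(cipher) - n + 1)
    ]


candidates = dict(slide_word(CIPHERTEXT, "ALLEMAND"))
for pos in (10, 26, 80, 96):
    print(f"{pos:03d}   {candidates[pos]}")
```

Because addition modulo 26 is commutative, the sketch cannot tell whether »allemand« sits in the plaintext or in the key at a given position; it only shows what the other row would have to contain.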


Four Good Matches

The first good match occurs at position 10:

            1
01234 56789 01234 56789
LUSIT FSATM TZJIZ SYDZM PMFIZ
            ALLEM AND
            TOYEN SLA
A plausible completion to the left could be CITOYENS, giving
            1
01234 56789 01234 56789
LUSIT FSATM TZJIZ SYDZM PMFIZ
         RE ALLEM AND
         CI TOYEN SLA

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The second good match occurs at position 26:

            1           2           3
01234 56789 01234 56789 01234 56789 01234 56789
LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA
                               ALLE MAND
                               ELAN NEXI
A plausible completion to the right could be LANNEXIONDE (»l'annexion de«), so we get
            1           2           3
01234 56789 01234 56789 01234 56789 01234 56789
LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA
                               ALLE MANDE ENT
                               ELAN NEXIO NDE

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The third good match occurs at position 80:

7           8           9
01234 56789 01234 56789 01234 56789
TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH
            ALLEM AND
            IONDE LAP
The previous letter could be T (»...tion de la p...«), providing not much help:
5           6           7           8           9
01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH
                                  E ALLEM AND
                                  T IONDE LAP

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

And the fourth good match, at position 96, is not helpful either:

8           9          10          11
01234 56789 01234 56789 01234 56789 01234 56789
IZYHQ LNSYM FWUEQ TELFH QTELQ IAXXV ZPYTL LGAVP
                   ALLE MAND
                   EAUD ETRI


Zig-Zag Exhaustion

The four good matches occur as two pairs (10 and 26, 80 and 96) whose positions differ by 16. This is a bit of evidence for an autokey cipher with a 16-letter key.

This is easily tested: If we really have an autokey cipher, then the fragments should match at another position too, preferably 16 positions apart.
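
The reason is that in an autokey cipher with a 16-letter key each plaintext letter is used again as a key letter 16 positions later. A fragment found at position p is therefore either plaintext, and then shows up as key at p + 16, or key, and then is plaintext at p - 16. A small Python sketch of this check follows; the period 16 is the hypothesis being tested, and `shifted_matches` is a name invented for this illustration.

```python
CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")

PERIOD = 16  # assumed key length, the hypothesis under test


def subtract(cipher_part, word):
    """Subtract word from cipher_part letter by letter (A=0, ..., Z=25)."""
    return "".join(
        chr((ord(c) - ord(w)) % 26 + ord("A"))
        for c, w in zip(cipher_part, word)
    )


def shifted_matches(cipher, fragment, pos, period=PERIOD):
    """Subtract fragment one period before and one period after pos."""
    results = {}
    for p in (pos - period, pos + period):
        if 0 <= p <= len(cipher) - len(fragment):
            results[p] = subtract(cipher[p:p + len(fragment)], fragment)
    return results


for p, text in shifted_matches(CIPHERTEXT, "ELANNEXIONDE", 26).items():
    print(f"{p:03d}: {text}")
```

If the hypothesis holds, one of the two printed strings should read as French text.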

Let's try the longest one, ELANNEXIONDE, at position 26. We expect exactly one match besides the one we already know, at position 26 - 16 = 10 or at 26 + 16 = 42. And we get:

000: HJSVGBVSFZQV
001: QHIGSODLYGWF
002: OXTSFWWEFMGE
003: EIFFNPPLLWFV
004: PUSNGIWRVVWO
005: BHAGZPCBUMPU
006: OPTZGVMALFVZ
007: WIMGMFLRELAV
008: PBTMWECKKQWI
009: IIZWVVVQPMJL
010: POJVMOBVLZMI
011: VYIMFUGRYCJB
012: FXZFLZCEBZCE
013: EOSLQVPHYSFV
014: VHYQMISERVWN
015: ONDMZLPXUMOA
016: USZZCIIALEBS
017: ZOMCZBLRDRTH
018: VBPZSECJQJIN
019: IEMSVVUWIYOV
020: LBFVMNHOXEWA
021: IUIMEAZDDMBG
022: BXZERSOJLRHH
023: EORRJHURQXIO
024: VGEJYNCWWYPN
025: NTWYEVHCXFOM
026: ALLEMANDEENT <===
027: SARMRGOKDDUY
028: HGZRXHVJCKZW
029: NOEXYOUIJPXP
030: VTKYFNTPONQB
031: AZLFEMAUMGCA
032: GASEDTFSFSBJ
033: HHRDKYDLRRKA
034: OGQKPWWXQABU
035: NFXPNPIWZRVX
036: MMCNGBHFQLYR
037: TRAGSAQWKOSK
038: YPTSRJHQNILE
039: WIFRAABTHBFS
040: PUEARUENAVTW
041: BTNRLXYGUJXD
042: ACELORRAINEE <===
043: JTYOIKLOMUFA
044: ANBIBEZSTVBH
045: UQVBVSDZURIH
046: XKOVJWKAQYIT
047: RDIJNDLWXYUB
048: KXWNUEHDXKCG
049: ELAUVAODJSHR
050: SPHVRHOPRXST
...
in perfect accord with our expectations. This gives
3           4           5           6           7
01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
ZEKLS RQXCA TFENE YBVOI WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX
              ELA NNEXI ONDE
              ACE LORRA INEE
and suggests »Alsace-Lorraine«. We complete the middle row that seems to be the keytext:
3           4           5           6           7
01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
ZEKLS RQXCA TFENE YBVOI WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX
          A INELA NNEXI ONDE
          A LSACE LORRA INEE
If we repeat the fragment from row 3 in row 2 at position 55 = 39 + 16 we see the very plausible text »l'annexion de l'Alsace-Lorraine«, and fill up the rows:
3           4           5           6           7
01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
ZEKLS RQXCA TFENE YBVOI WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX
          A INELA NNEXI ONDEL ALSAC ELORR AINEE
          A LSACE LORRA INEET LAFFI RMATI ONDEL
To find the key we go backwards in zig-zag:
            1           2           3           4
01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI
                           IR EALLE MANDE ENTRA INELA NNEXI
                           AI NELAN NEXIO NDELA LSACE LORRA
            1           2           3           4
01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI
        SCI TOYEN SLAVI CTOIR EALLE MANDE ENTRA INELA NNEXI
        IRE ALLEM ANDEE NTRAI NELAN NEXIO NDELA LSACE LORRA
            1           2           3           4
01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI
AUXAR MESCI TOYEN SLAVI CTOIR EALLE MANDE ENTRA INELA NNEXI
LAVIC TOIRE ALLEM ANDEE NTRAI NELAN NEXIO NDELA LSACE LORRA
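
The backward zig-zag in the diagrams above can be sketched in a few lines of Python, assuming the key length 16 and taking as known the 16 plaintext letters LALSACELORRAINEE at positions 38 to 53, as read off the rows above. The function name `zigzag_back` is made up for this illustration. Subtracting a known plaintext letter from the ciphertext gives the key letter at that position, and this key letter is the plaintext letter 16 positions earlier; for positions below 16 it is a letter of the key itself.

```python
CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")

PERIOD = 16  # assumed key length


def zigzag_back(cipher, fragment, start, period=PERIOD):
    """Work backwards from plaintext known at cipher[start:start+len(fragment)]."""
    plain = {start + i: ch for i, ch in enumerate(fragment)}
    key = {}
    # Go from right to left, so that every letter derived for position
    # i - period is picked up again later in the same loop.
    for i in range(start + len(fragment) - 1, -1, -1):
        if i not in plain:
            continue
        k = chr((ord(cipher[i]) - ord(plain[i])) % 26 + ord("A"))
        if i >= period:
            plain[i - period] = k   # key letter = earlier plaintext letter
        else:
            key[i] = k              # letter of the key itself
    key_text = "".join(key.get(i, "?") for i in range(period))
    plain_text = "".join(plain.get(i, "?") for i in range(start + len(fragment)))
    return key_text, plain_text


key, plain = zigzag_back(CIPHERTEXT, "LALSACELORRAINEE", 38)
print(key)
print(plain)
```

A known fragment shorter than the key length would leave gaps, shown as question marks in this sketch.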

Now it's certain that we have an autokey cipher and the key is »Aux armes, citoyens«—a line from the »Marseillaise«. Using the key we easily decipher the complete plaintext:

La victoire allemande entraîne l'annexion de l'Alsace-Lorraine et l'affirmation de la puissance allemande en Europe au détriment de l'Autriche-Hongrie et de la France.

[The German victory entails the annexation of Alsace-Lorraine and the affirmation of German power in Europe at the expense of Austria-Hungary and France.]
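
For completeness, here is a short Python sketch of the decryption with the recovered key. It assumes the usual autokey scheme over the standard alphabet in which the key stream consists of the key followed by the plaintext itself; the function names are chosen for this illustration only.

```python
CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")

KEY = "AUXARMESCITOYENS"


def autokey_decrypt(cipher, key):
    """Decrypt with the key stream 'key followed by the plaintext'."""
    stream = list(key)
    plain = []
    for i, c in enumerate(cipher):
        p = chr((ord(c) - ord(stream[i])) % 26 + ord("A"))
        plain.append(p)
        stream.append(p)  # each recovered letter extends the key stream
    return "".join(plain)


def autokey_encrypt(plain, key):
    """Encrypt by adding the key stream 'key followed by the plaintext'."""
    stream = key + plain
    return "".join(
        chr((ord(p) + ord(k) - 2 * ord("A")) % 26 + ord("A"))
        for p, k in zip(plain, stream)
    )


plaintext = autokey_decrypt(CIPHERTEXT, KEY)
print(plaintext)
print(autokey_encrypt(plaintext, KEY) == CIPHERTEXT)
```

The output is the plaintext without spaces, accents, and punctuation; word boundaries have to be restored by hand.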


Author: Klaus Pommerening, 2014-May-11; last change: 2014-Jun-11.