Cryptology

Cryptanalysis of an Autokey Cipher
Suppose we got the ciphertext
LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR
The context suggests that the plaintext language is French.
Here are some statistics. The letter count
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8 | 1 | 3 | 1 | 9 | 6 | 1 | 4 | 10 | 1 | 4 | 11 | 4 | 3 | 3 | 5 | 7 | 6 | 5 | 10 | 4 | 5 | 3 | 9 | 6 | 9 |
as well as the coincidence index 0.0437 suggest a polyalphabetic cipher. The autocoincidence spectrum shows no meaningful period. The frequency distribution of the single letters hints at a running-key or autokey cipher that uses the standard alphabet (= TRITHEMIUS table).
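The coincidence index quoted above can be recomputed from the ciphertext. The following is a minimal sketch, assuming the usual definition (the probability that two letters drawn at random without replacement are equal); the names `CIPHERTEXT` and `coincidence_index` are illustrative and not taken from the original tools.

```python
from collections import Counter

# Ciphertext from the text above, with the group spacing removed.
CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")


def coincidence_index(text: str) -> float:
    """Chance that two letters drawn without replacement coincide."""
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


print(len(CIPHERTEXT))
print(round(coincidence_index(CIPHERTEXT), 4))
```

With this definition the 138 letters give a value that rounds to the 0.0437 stated above, well below what a monoalphabetic substitution of French text would be expected to show.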
Since the message probably originated from the French embassy in Berlin in 1870, we may assume that the plaintext contains the word »allemand«. Moving this word along the ciphertext, with the help of the Perl script probwd.pl, we get 4 good matches (plus some weak ones):
Position | Substring | MFL rate | BLW score |
---|---|---|---|
000 | LJHEHFFX | 0,25 | 4,1 |
001 | UHXPTSNQ | 0,50 | 4,3 |
002 | SXIBGAGJ | 0,38 | 4,2 |
003 | IIUOOTZQ | --> 0,75 | 3,0 |
004 | TUHWHMGW | 0,25 | 1,4 |
005 | FHPPATMG | 0,25 | 5,7 |
006 | SPIIHZWF | 0,38 | 2,9 |
007 | AIBPNJVW | 0,38 | 2,9 |
008 | TBIVXIMP | 0,38 | 5,8 |
009 | MIOFWZFV | 0,25 | 4,0 |
010 | TOYENSLA | ==> 0,88 | --> 10,5 <=== |
011 | ZYXVGYQW | 0,00 | 0,3 |
012 | JXOOMDMJ | 0,25 | 1,9 |
013 | IOHURZZM | 0,50 | 4,3 |
014 | ZHNZNMCJ | 0,25 | 0,3 |
015 | SNSVAPZC | 0,50 | 6,6 |
016 | YSOIDMSF | 0,50 | 6,2 |
017 | DOBLAFVW | 0,38 | 7,0 |
018 | ZBEITIMO | 0,62 | 8,7 |
019 | MEBBWZEB | 0,25 | 4,5 |
020 | PBUENRRT | --> 0,75 | 8,4 |
021 | MUXVFEJI | 0,38 | 4,4 |
022 | FXONSWYO | 0,50 | 4,2 |
023 | IOGAKLEW | 0,62 | 6,0 |
024 | ZGTSZRMB | 0,38 | 4,5 |
025 | RTLHFZRH | 0,50 | 3,9 |
026 | ELANNEXI | ==> 0,88 | --> 11,9 <=== |
027 | WAGVSKYP | 0,25 | 1,5 |
028 | LGOAYLFO | 0,62 | 3,2 |
029 | ROTGZSEN | --> 0,75 | 7,9 |
030 | ZTZHGRDU | 0,38 | 4,3 |
031 | EZAOFQKZ | 0,38 | 1,7 |
032 | KAHNEXPX | 0,38 | 3,3 |
033 | LHGMLCNQ | 0,38 | 2,3 |
034 | SGFTQAGC | 0,38 | 3,4 |
035 | RFMYOTSB | 0,50 | 4,2 |
036 | QMRWHFRK | 0,25 | 1,0 |
037 | XRPPTEAB | 0,50 | 8,0 |
038 | CPIBSNRV | 0,50 | 4,7 |
039 | AIUABELY | --> 0,75 | 8,5 |
040 | TUTJSYOS | --> 0,75 | 5,3 |
041 | FTCAMBIL | 0,50 | 8,1 |
042 | ECTUPVBF | 0,38 | 6,1 |
043 | NTNXJOVT | 0,62 | 4,7 |
044 | ENQRCIJX | 0,50 | 5,8 |
045 | YQKKWWNE | 0,25 | 2,4 |
046 | BKDEKAUF | 0,38 | 5,2 |
047 | VDXSOHVB | 0,25 | 2,5 |
048 | OXLWVIRI | 0,62 | 5,7 |
049 | ILPDWEYI | 0,50 | 2,5 |
050 | WPWESLYU | 0,50 | 5,1 |
051 | AWXAZLKC | 0,38 | 0,0 |
052 | HXTHZXSH | 0,25 | 2,8 |
053 | ITAHLFXS | 0,62 | 5,5 |
054 | EAATTKIU | ==> 0,88 | 5,6 |
055 | LAMBYVKL | 0,38 | 4,8 |
056 | LMUGJXBH | 0,25 | 2,3 |
057 | XUZRLOXW | 0,50 | 3,5 |
058 | FZKTCKML | 0,25 | 1,0 |
059 | KKMKYZBS | 0,12 | 0,3 |
060 | VMDGNOIN | 0,50 | 6,3 |
061 | XDZVCVDF | 0,00 | 0,6 |
062 | OZOKJQVM | 0,25 | 0,3 |
063 | KODREICQ | 0,50 | 6,2 |
064 | ZDKMWPGX | 0,00 | 0,0 |
065 | OKFEDTNR | 0,62 | 5,2 |
066 | VFXLHAHK | 0,25 | 2,0 |
067 | QXEPOUAU | 0,62 | 9,4 |
068 | IEIWINKX | 0,62 | 5,1 |
069 | PIPQBXNO | 0,38 | 3,2 |
070 | TPJJLAEW | 0,50 | 3,9 |
071 | AJCTORMZ | 0,50 | 6,4 |
072 | UCMWFZPU | 0,25 | 2,7 |
073 | NMPNNCKF | 0,38 | 5,1 |
074 | XPGVQXVW | 0,00 | 0,3 |
075 | AGOYLIMV | 0,50 | 5,1 |
076 | RORTWZLE | --> 0,75 | 7,6 |
077 | ZRMENYUN | 0,62 | 8,7 |
078 | CMXVMHDI | 0,12 | 2,0 |
079 | XXOUVQYK | 0,25 | 3,7 |
080 | IONDELAP | --> 0,75 | ==> 14,0 <=== |
081 | ZNWMZNFV | 0,25 | 1,1 |
082 | YWFHBSLJ | 0,25 | 2,6 |
083 | HFAJGYZC | 0,12 | 2,0 |
084 | QACOMMST | 0,50 | 8,0 |
085 | LCHUAFJR | 0,50 | 5,3 |
086 | NHNITWHB | 0,50 | 3,5 |
087 | SNBBKURN | 0,62 | 3,9 |
088 | YBUSIEDQ | 0,50 | 7,9 |
089 | MULQSQGB | 0,38 | 4,0 |
090 | FLJAETRI | --> 0,75 | 6,8 |
091 | WJTMHEYC | 0,25 | 3,0 |
092 | UTFPSLSE | --> 0,75 | 7,7 |
093 | EFIAZFUN | 0,62 | 5,7 |
094 | QITHTHDQ | 0,38 | 3,7 |
095 | TTABVQGB | 0,38 | 5,0 |
096 | EAUDETRI | ==> 0,88 | ==> 12,6 <=== |
097 | LUWMHEYN | 0,50 | 3,2 |
098 | FWFPSLDF | 0,25 | 3,0 |
099 | HFIAZQVX | 0,25 | 1,9 |
100 | QITHEINU | --> 0,75 | 8,6 |
101 | TTAMWAKU | 0,62 | 5,0 |
102 | EAFEOXKS | 0,62 | 5,1 |
103 | LFXWLXIW | 0,38 | 1,1 |
104 | QXPTLVMM | 0,25 | 3,7 |
105 | IPMTJZCV | 0,25 | 1,4 |
106 | AMMRNPLQ | 0,50 | 6,0 |
107 | XMKVDYGI | 0,12 | 0,6 |
108 | XKOLMTYI | 0,50 | 1,7 |
109 | VOEUHLYD | 0,50 | 4,3 |
110 | ZENPZLTX | 0,50 | 4,2 |
111 | PNIHZGNS | 0,50 | 4,6 |
112 | YIAHUAIM | 0,62 | 6,1 |
113 | TAACOVCX | 0,50 | 6,5 |
114 | LAVWJPNO | 0,50 | 5,3 |
115 | LVPRDAEQ | 0,50 | 7,0 |
116 | GPKLORGH | 0,38 | 3,3 |
117 | AKEWFTXI | 0,50 | 0,8 |
118 | VEPNHKYF | 0,25 | 3,9 |
119 | PPGPYLVM | 0,12 | 0,9 |
120 | AGIGZICQ | 0,38 | 4,2 |
121 | RIZHWPGU | 0,38 | 2,8 |
122 | TZAEDTKU | 0,62 | 2,7 |
123 | KAXLHXKZ | 0,25 | 1,4 |
124 | LXEPLXPF | 0,38 | 4,1 |
125 | IEITLCVE | --> 0,75 | 9,5 |
126 | PIMTQIUV | 0,50 | 4,5 |
127 | TMMYWHLB | 0,25 | 2,8 |
128 | XMREVYRR | 0,50 | 5,1 |
129 | XRXDMEHN | 0,38 | 4,0 |
The tests for language recognition (the MFL rate and the BLW score in the table) were used to weight the matches; the arrows mark the highest values.
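The sliding of the probable word can be sketched in a few lines. This is not the probwd.pl script, only an illustration of the same idea: subtract »ALLEMAND« letter by letter (mod 26) from the ciphertext at every position. The function names are hypothetical. The ten-letter set `ASSUMED_FREQUENT` is a guess at what the MFL test counts; it is an assumption that happens to agree with the MFL rates of the rows checked by hand, and the BLW score is not reproduced here.

```python
A = ord("A")

CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")

# Assumption: ten frequent French letters, not taken from probwd.pl.
ASSUMED_FREQUENT = set("AEILNORSTU")


def subtract(cipher: str, word: str) -> str:
    """Letterwise difference cipher - word in the standard alphabet (mod 26)."""
    return "".join(chr((ord(c) - ord(w)) % 26 + A) for c, w in zip(cipher, word))


def slide(cipher: str, word: str) -> list[str]:
    """Candidate substrings for every position where the word still fits."""
    n = len(word)
    return [subtract(cipher[i:i + n], word) for i in range(len(cipher) - n + 1)]


def mfl_rate(text: str) -> float:
    """Fraction of letters that belong to the assumed frequent set."""
    return sum(ch in ASSUMED_FREQUENT for ch in text) / len(text)


candidates = slide(CIPHERTEXT, "ALLEMAND")
for pos in (10, 26, 80, 96):
    print(f"{pos:03d} {candidates[pos]} {mfl_rate(candidates[pos]):.2f}")
```

Because subtraction is symmetric in the roles of key and plaintext, a readable candidate only tells us that »allemand« sits at that position in one of the two streams, not in which.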
The first good match occurs at position 10:
                1
    01234 56789 01234 56789
    LUSIT FSATM TZJIZ SYDZM PMFIZ
                ALLEM AND
                TOYEN SLA

A plausible completion to the left could be CITOYENS, giving

                1
    01234 56789 01234 56789
    LUSIT FSATM TZJIZ SYDZM PMFIZ
             RE ALLEM AND
             CI TOYEN SLA
The second good match occurs at position 26:
                1           2           3
    01234 56789 01234 56789 01234 56789 01234 56789
    LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA
                                   ALLE MAND
                                   ELAN NEXI

A plausible completion to the right could be LANNEXIONDE (»l'annexion de«), so we get

                1           2           3
    01234 56789 01234 56789 01234 56789 01234 56789
    LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA
                                   ALLE MANDE ENT
                                   ELAN NEXIO NDE
The third good match occurs at position 80:
    7           8           9
    01234 56789 01234 56789 01234 56789
    TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH
                ALLEM AND
                IONDE LAP

The previous letter could be T (»...tion de la p...«), providing not much help:

    5           6           7           8           9
    01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
    WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH
                                      E ALLEM AND
                                      T IONDE LAP
And the fourth good match at position 96 also is not helpful:
    8           9           10          11
    01234 56789 01234 56789 01234 56789 01234 56789
    IZYHQ LNSYM FWUEQ TELFH QTELQ IAXXV ZPYTL LGAVP
                       ALLE MAND
                       EAUD ETRI
The four good matches occur as two pairs whose positions differ by 16 (10 and 26, 80 and 96). This is a bit of evidence for an autokey cipher with a 16-letter key.
This is easily tested: If we really have an autokey cipher, then the fragments should match at another position too, preferably 16 positions apart.
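A minimal sketch of this test, under the assumption that the period is 16: a fragment that is plaintext at one position should reappear as key 16 positions later, and a fragment that is key should have been plaintext 16 positions earlier. So we subtract the fragment from the ciphertext at both shifted positions and look for readable text. The helper name `shifted_fragments` is hypothetical; the fragment ELANNEXIONDE at position 26 is the one found above.

```python
A = ord("A")

CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")


def subtract(cipher: str, word: str) -> str:
    """Letterwise difference cipher - word in the standard alphabet (mod 26)."""
    return "".join(chr((ord(c) - ord(w)) % 26 + A) for c, w in zip(cipher, word))


def shifted_fragments(cipher: str, fragment: str, pos: int, period: int) -> dict[int, str]:
    """Subtract the fragment at pos - period and pos + period, where it fits."""
    out = {}
    for start in (pos - period, pos + period):
        if start >= 0 and start + len(fragment) <= len(cipher):
            out[start] = subtract(cipher[start:start + len(fragment)], fragment)
    return out


shifted = shifted_fragments(CIPHERTEXT, "ELANNEXIONDE", 26, 16)
for start, text in sorted(shifted.items()):
    print(f"{start:03d}: {text}")
```

Exactly one of the two shifted positions should yield plausible French if the hypothesis holds; which one tells us whether the fragment was plaintext or key at its original position.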
Let's try the longest one, ELANNEXIONDE, at position 26. We expect exactly one match besides the one we already know, at position 26 - 16 = 10 or at 26 + 16 = 42. And we get:
    000: HJSVGBVSFZQV
    001: QHIGSODLYGWF
    002: OXTSFWWEFMGE
    003: EIFFNPPLLWFV
    004: PUSNGIWRVVWO
    005: BHAGZPCBUMPU
    006: OPTZGVMALFVZ
    007: WIMGMFLRELAV
    008: PBTMWECKKQWI
    009: IIZWVVVQPMJL
    010: POJVMOBVLZMI
    011: VYIMFUGRYCJB
    012: FXZFLZCEBZCE
    013: EOSLQVPHYSFV
    014: VHYQMISERVWN
    015: ONDMZLPXUMOA
    016: USZZCIIALEBS
    017: ZOMCZBLRDRTH
    018: VBPZSECJQJIN
    019: IEMSVVUWIYOV
    020: LBFVMNHOXEWA
    021: IUIMEAZDDMBG
    022: BXZERSOJLRHH
    023: EORRJHURQXIO
    024: VGEJYNCWWYPN
    025: NTWYEVHCXFOM
    026: ALLEMANDEENT <===
    027: SARMRGOKDDUY
    028: HGZRXHVJCKZW
    029: NOEXYOUIJPXP
    030: VTKYFNTPONQB
    031: AZLFEMAUMGCA
    032: GASEDTFSFSBJ
    033: HHRDKYDLRRKA
    034: OGQKPWWXQABU
    035: NFXPNPIWZRVX
    036: MMCNGBHFQLYR
    037: TRAGSAQWKOSK
    038: YPTSRJHQNILE
    039: WIFRAABTHBFS
    040: PUEARUENAVTW
    041: BTNRLXYGUJXD
    042: ACELORRAINEE <===
    043: JTYOIKLOMUFA
    044: ANBIBEZSTVBH
    045: UQVBVSDZURIH
    046: XKOVJWKAQYIT
    047: RDIJNDLWXYUB
    048: KXWNUEHDXKCG
    049: ELAUVAODJSHR
    050: SPHVRHOPRXST

...a perfect accord with our expectations. This gives
    3           4           5           6           7
    01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
    ZEKLS RQXCA TFENE YBVOI WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX
                  ELA NNEXI ONDE
                  ACE LORRA INEE

and suggests »Alsace-Lorraine«. We complete the middle row that seems to be the keytext:

    3           4           5           6           7
    01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
    ZEKLS RQXCA TFENE YBVOI WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX
              A INELA NNEXI ONDE
              A LSACE LORRA INEE

If we repeat the fragment from row 3 in row 2 at position 55 = 39 + 16 we see the very plausible text »l'annexion de l'Alsace-Lorraine«, and fill up the rows:

    3           4           5           6           7
    01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
    ZEKLS RQXCA TFENE YBVOI WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX
              A INELA NNEXI ONDEL ALSAC ELORR AINEE
              A LSACE LORRA INEET LAFFI RMATI ONDEL

To find the key we go backwards in zig-zag:
                1           2           3           4
    01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
    LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI
                               IR EALLE MANDE ENTRA INELA NNEXI
                               AI NELAN NEXIO NDELA LSACE LORRA

                1           2           3           4
    01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
    LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI
            SCI TOYEN SLAVI CTOIR EALLE MANDE ENTRA INELA NNEXI
            IRE ALLEM ANDEE NTRAI NELAN NEXIO NDELA LSACE LORRA

                1           2           3           4
    01234 56789 01234 56789 01234 56789 01234 56789 01234 56789
    LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI
    AUXAR MESCI TOYEN SLAVI CTOIR EALLE MANDE ENTRA INELA NNEXI
    LAVIC TOIRE ALLEM ANDEE NTRAI NELAN NEXIO NDELA LSACE LORRA
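The zig-zag above can be sketched as a single backward pass, assuming a key length of 16 and the standard alphabet. For each position with known plaintext, the key letter is ciphertext minus plaintext; from position 16 onward that key letter is also the plaintext 16 positions earlier, and before position 16 it belongs to the primary key. The function name `recover_primary_key` and the use of `?` for undetermined key letters are choices made for this illustration.

```python
A = ord("A")

CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")


def recover_primary_key(cipher: str, known: str, start: int, keylen: int) -> str:
    """Walk backwards from a known plaintext fragment to the primary key."""
    plain = [None] * len(cipher)
    for i, ch in enumerate(known):
        plain[start + i] = ord(ch) - A
    key = [None] * keylen
    # Descending order: plain[i] may have been filled in from position i + keylen.
    for i in range(len(cipher) - 1, -1, -1):
        if plain[i] is None:
            continue
        k = (ord(cipher[i]) - A - plain[i]) % 26
        if i >= keylen:
            if plain[i - keylen] is None:
                plain[i - keylen] = k  # key letter = earlier plaintext letter
        else:
            key[i] = k  # key letter of the primary key
    return "".join("?" if k is None else chr(k + A) for k in key)


print(recover_primary_key(CIPHERTEXT, "ELANNEXIONDE", 26, 16))
print(recover_primary_key(CIPHERTEXT, "ALSACELORRAINEETLAFFIRMATIONDEL", 39, 16))
```

A fragment shorter than the key length leaves gaps in the recovered key; a fragment of at least 16 consecutive known plaintext letters determines it completely.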
Now it's certain that we have an autokey cipher and the key is »Aux armes, citoyens«—a line from the »Marseillaise«. Using the key we easily decipher the complete plaintext:
La victoire allemande entraîne l'annexion de l'Alsace-Lorraine et l'affirmation de la puissance allemande en Europe au détriment de l'Autriche-Hongrie et de la France.
[The German victory leads to the annexation of Alsace-Lorraine and the affirmation of German power in Europe at the expense of Austria-Hungary and France.]
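For completeness, here is a minimal sketch of the decryption with the recovered key, assuming the autokey variant implied above: the keystream is the primary key followed by the plaintext itself, and encryption adds key and plaintext letters mod 26 in the standard alphabet. The function names are illustrative.

```python
A = ord("A")

CIPHERTEXT = (
    "LUSIT FSATM TZJIZ SYDZM PMFIZ REWLR ZEKLS RQXCA TFENE YBVOI "
    "WAHIE LLXFK VXOKZ OVQIP TAUNX ARZCX IZYHQ LNSYM FWUEQ TELFH "
    "QTELQ IAXXV ZPYTL LGAVP ARTKL IPTXX CIHYE UQR"
).replace(" ", "")


def autokey_decrypt(cipher: str, key: str) -> str:
    """Decrypt with keystream = primary key followed by the recovered plaintext."""
    stream = [ord(k) - A for k in key]
    plain = []
    for i, c in enumerate(cipher):
        p = (ord(c) - A - stream[i]) % 26
        plain.append(chr(p + A))
        stream.append(p)  # each plaintext letter becomes a later key letter
    return "".join(plain)


def autokey_encrypt(plain: str, key: str) -> str:
    """Inverse operation, useful as a round-trip check."""
    stream = [ord(k) - A for k in key] + [ord(p) - A for p in plain]
    return "".join(chr((ord(p) - A + stream[i]) % 26 + A) for i, p in enumerate(plain))


plaintext = autokey_decrypt(CIPHERTEXT, "AUXARMESCITOYENS")
print(plaintext)
```

The output is the plaintext above without spaces, accents, and punctuation.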