Cryptology

Recognizing Plaintext: SINKOV's Log-Weight Test

Definition

The MFL (most-frequent-letters) test is simple and efficient. SINKOV proposed a more refined test that uses the information contained in all single-letter frequencies, rather than merely dividing the letters into two classes.

Each letter is given a log-weight depending on the language. More frequent letters get a higher weight.

A rationale for the weights is given in the mathematical version of this section. The weights for English, German, and French are given in the following table:

Log-Weights

Letter  English  German  French
A       1.9      1.8     1.9
B       1.2      1.3     1.0
C       1.4      1.4     1.5
D       1.6      1.7     1.6
E       2.1      2.2     2.2
F       1.3      1.2     1.1
G       1.3      1.5     1.0
H       1.8      1.6     0.8
I       1.8      1.9     1.8
J       0.3      0.5     0.5
K       0.9      1.2     0.0
L       1.6      1.5     1.8
M       1.4      1.4     1.4
N       1.8      2.0     1.9
O       1.9      1.5     1.7
P       1.3      1.0     1.4
Q       0.0      0.0     1.0
R       1.8      1.9     1.8
S       1.8      1.8     1.9
T       1.9      1.8     1.9
U       1.4      1.6     1.8
V       1.0      1.0     1.2
W       1.4      1.2     0.0
X       0.0      0.0     0.6
Y       1.3      0.0     0.3
Z       0.0      1.0     0.0

To calculate the log-weight (LW) score of a string, simply add up the log-weights of its letters. For example, the string LETTER has the (English) log-weight

1.6 + 2.1 + 1.9 + 1.9 + 2.1 + 1.8 = 11.4

The Perl script LWscore.pl calculates LW scores for lists of strings.
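The calculation can also be sketched in a few lines of Python. This is an illustration only, not a port of LWscore.pl: the names LOG_WEIGHT_ROWS, build_weights and lw_score are hypothetical, and two choices are assumptions not stated in the text, namely that characters outside A..Z contribute nothing and that the sum is rounded to one decimal.

```python
# Log-weights transcribed from the table in this section.
# Row format: letter, English, German, French.
LOG_WEIGHT_ROWS = """
A 1.9 1.8 1.9
B 1.2 1.3 1.0
C 1.4 1.4 1.5
D 1.6 1.7 1.6
E 2.1 2.2 2.2
F 1.3 1.2 1.1
G 1.3 1.5 1.0
H 1.8 1.6 0.8
I 1.8 1.9 1.8
J 0.3 0.5 0.5
K 0.9 1.2 0.0
L 1.6 1.5 1.8
M 1.4 1.4 1.4
N 1.8 2.0 1.9
O 1.9 1.5 1.7
P 1.3 1.0 1.4
Q 0.0 0.0 1.0
R 1.8 1.9 1.8
S 1.8 1.8 1.9
T 1.9 1.8 1.9
U 1.4 1.6 1.8
V 1.0 1.0 1.2
W 1.4 1.2 0.0
X 0.0 0.0 0.6
Y 1.3 0.0 0.3
Z 0.0 1.0 0.0
"""

LANGUAGES = ("english", "german", "french")


def build_weights(rows=LOG_WEIGHT_ROWS):
    """Parse the rows into one {letter: weight} dict per language."""
    weights = {language: {} for language in LANGUAGES}
    for line in rows.split("\n"):
        fields = line.split()
        if not fields:
            continue  # skip the blank first and last lines
        letter, values = fields[0], fields[1:]
        for language, value in zip(LANGUAGES, values):
            weights[language][letter] = float(value)
    return weights


WEIGHTS = build_weights()


def lw_score(text, language="english"):
    """Sum the log-weights of the letters of text.

    Assumptions: characters outside A..Z contribute 0, and the sum is
    rounded to one decimal to hide floating-point noise.
    """
    table = WEIGHTS[language]
    return round(sum(table.get(ch, 0.0) for ch in text.upper()), 1)


print(lw_score("LETTER"))  # prints 11.4
```

Keeping the weights as a text block that mirrors the table makes it easy to compare the code against the table by eye.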


The CAESAR Example

These are the LW scores for the Caesar example (with English weights):

       FDHVDU  8.7     OMQEMD  8.4       XVZNVM  5.2
       GEIWEV  9.7     PNRFNE 10.1 <---  YWAOWN  9.7
       HFJXFW  6.1     QOSGOF  8.2       ZXBPXO  4.4
       IGKYGX  6.6     RPTHPG  9.4       AYCQYP  7.2
       JHLZHY  6.8     SQUIQH  6.8       BZDRZQ  4.6
       KIMAIZ  7.8     TRVJRI  8.6       CAESAR 10.9 <===
       LJNBJA  7.1     USWKSJ  7.6       DBFTBS  9.0
       MKOCKB  7.7     VTXLTK  7.3       ECGUCT  9.5
       NLPDLC  9.3     WUYMUL  8.5

The correct solution, CAESAR, stands out clearly. The order of the non-solutions is somewhat permuted compared with the MFL score.
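The table above can be reproduced with a short Python sketch that scores all 26 shifts of the ciphertext FDHVDU and sorts them. The function names are hypothetical, and the sketch assumes the input consists of uppercase letters A..Z only. The English weights are stored in tenths as integers so that the sums are exact.

```python
# English log-weights from the table, in tenths, for A..Z.
ENGLISH_TENTHS = [19, 12, 14, 16, 21, 13, 13, 18, 18, 3, 9, 16, 14,
                  18, 19, 13, 0, 18, 18, 19, 14, 10, 14, 0, 13, 0]


def lw_score(text):
    """English LW score of an uppercase A..Z string."""
    return sum(ENGLISH_TENTHS[ord(ch) - ord("A")] for ch in text) / 10


def shift(text, k):
    """Shift every letter of text forward by k places in the alphabet."""
    return "".join(chr((ord(ch) - ord("A") + k) % 26 + ord("A")) for ch in text)


def rank_shifts(ciphertext):
    """Return (score, candidate) pairs for all 26 shifts, best first."""
    candidates = [shift(ciphertext, k) for k in range(26)]
    return sorted(((lw_score(c), c) for c in candidates), reverse=True)


ranking = rank_shifts("FDHVDU")
for score, candidate in ranking[:3]:
    print(candidate, score)  # first line printed: CAESAR 10.9
```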


Application of LW Scores to the Cryptanalysis of the BELASO Cipher

For the period-4 example the LW scores are as follows:

Column 1

UDHWHUPLSLWD 18.7   DMQFQDYUBUFM 13.9   MVZOZMHDKDOV 14.5
VEIXIVQMTMXE 14.5   ENRGREZVCVGN 17.4   NWAPANIELEPW 20.4 <--
WFJYJWRNUNYF 15.4   FOSHSFAWDWHO 19.9   OXBQBOJFMFQX 10.5
XGKZKXSOVOZG 11.0   GPTITGBXEXIP 15.9   PYCRCPKGNGRY 16.9
YHLALYTPWPAH 19.1   HQUJUHCYFYJQ 12.3   QZDSDQLHOHSZ 13.9
ZIMBMZUQXQBI 10.2   IRVKVIDZGZKR 13.9   RAETERMIPITA 21.7 <==
AJNCNAVRYRCJ 16.7   JSWLWJEAHALS 17.9   SBFUFSNJQJUB 13.8
BKODOBWSZSDK 16.2   KTXMXKFBIBMT 13.9   TCGVGTOKRKVC 16.7
CLPEPCXTATEL 18.5   LUYNYLGCJCNU 16.6

Column 2

MBTWZWIBWJWL 15.0   VKCFIFRKFSFU 16.2   ETLOROATOBOD 21.6 <==
NCUXAXJCXKXM 10.5   WLDGJGSLGTGV 16.4   FUMPSPBUPCPE 17.2
ODVYBYKDYLYN 16.8   XMEHKHTMHUHW 17.7   GVNQTQCVQDQF 11.3
PEWZCZLEZMZO 13.2   YNFILIUNIVIX 17.4   HWORURDWRERG 20.1 <--
QFXADAMFANAP 16.3   ZOGJMJVOJWJY 11.4   IXPSVSEXSFSH 16.5
RGYBEBNGBOBQ 16.3   APHKNKWPKXKZ 13.1   JYQTWTFYTGTI 16.3
SHZCFCOHCPCR 17.3   BQILOLXQLYLA 14.5   KZRUXUGZUHUJ 11.7
TIADGDPIDQDS 18.2   CRJMPMYRMZMB 14.7   LASVYVHAVIVK 17.0
UJBEHEQJERET 17.1   DSKNQNZSNANC 16.6

Column 3

HLSJWJCAKDJ 13.3     QUBSFSLJTMS 14.5   ZDKBOBUSCVB 13.6
IMTKXKDBLEK 14.3     RVCTGTMKUNT 16.7   AELCPCVTDWC 17.0
JNULYLECMFL 15.8     SWDUHUNLVOU 17.1   BFMDQDWUEXD 13.6
KOVMZMFDNGM 14.0     TXEVIVOMWPV 14.8   CGNEREXVFYE 16.2
LPWNANGEOHN 18.7 <-- UYFWJWPNXQW 11.6   DHOFSFYWGZF 15.0
MQXOBOHFPIO 14.5     VZGXKXQOYRX  8.2   EIPGTGZXHAG 14.7
NRYPCPIGQJP 13.6     WAHYLYRPZSY 15.5   FJQHUHAYIBH 14.6
OSZQDQJHRKQ 10.1     XBIZMZSQATZ 10.0   GKRIVIBZJCI 13.3
PTARERKISLR 18.7 <-- YCJANATRBUA 16.8

Column 4

ORCNBCOWCOO 18.0     XALWKLXFLXX 10.3     GJUFTUGOUGG 14.8
PSDOCDPXDPP 15.1     YBMXLMYGMYY 13.5     HKVGUVHPVHH 15.1
QTEPDEQYEQQ 12.4     ZCNYMNZHNZZ 11.3     ILWHVWIQWII 15.8
RUFQEFRZFRR 14.6     ADOZNOAIOAA 18.5     JMXIWXJRXJJ  7.6
SVGRFGSAGSS 17.1     BEPAOPBJPBB 14.9     KNYJXYKSYKK 11.4
TWHSGHTBHTT 18.7 <-- CFQBPQCKQCC 10.3     LOZKYZLTZLL 12.4
UXITHIUCIUU 16.1     DGRCQRDLRDD 16.1     MPALZAMUAMM 15.6
VYJUIJVDJVV 11.0     EHSDRSEMSEE 20.4 <== NQBMABNVBNN 15.1
WZKVJKWEKWW 11.7     FITESTFNTFF 18.4

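The method picks the correct solution in every column except column 3, where the top score of 18.7 occurs twice (LPWNANGEOHN and PTARERKISLR), so the LW score alone cannot decide between these two candidates.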
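The per-column search, including the tie in column 3, can be sketched in Python as follows. The helper top_candidates is a hypothetical name; it returns every shift that attains the best score instead of silently picking one. The sketch assumes uppercase A..Z input and stores the English weights in tenths, so that equal scores compare as exactly equal.

```python
# English log-weights from the table, in tenths, for A..Z.
ENGLISH_TENTHS = [19, 12, 14, 16, 21, 13, 13, 18, 18, 3, 9, 16, 14,
                  18, 19, 13, 0, 18, 18, 19, 14, 10, 14, 0, 13, 0]


def lw_score(text):
    """English LW score of an uppercase A..Z string."""
    return sum(ENGLISH_TENTHS[ord(ch) - ord("A")] for ch in text) / 10


def shift(text, k):
    """Shift every letter of text forward by k places in the alphabet."""
    return "".join(chr((ord(ch) - ord("A") + k) % 26 + ord("A")) for ch in text)


def top_candidates(column):
    """Return the best LW score and all shifts of column that attain it."""
    scored = []
    for k in range(26):
        candidate = shift(column, k)
        scored.append((lw_score(candidate), candidate))
    best = max(score for score, _ in scored)
    return best, [candidate for score, candidate in scored if score == best]


# First rows of columns 1 and 3 of the table above.
print(top_candidates("UDHWHUPLSLWD"))  # one winner
print(top_candidates("HLSJWJCAKDJ"))   # two candidates share the top score
```

Returning a list makes a tie visible, so that it can be resolved by other means, for example by inspecting the neighbouring columns.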


Summary

In summary the examples show no clear advantage of the LW-method over the MFL-method, notwithstanding the higher granularity of the information used to compute the scores.

Friedman's MFL-method has the advantage of being very easy to evaluate by hand.


Author: Klaus Pommerening, 2014-Jun-10; last change: 2014-Jun-10.