We want to test whether a conjectured period l is the true period. To do so, we write the text in rows of width l and consider the columns.
The columns may be quite short, so their individual coincidence indices scatter widely and give no clear picture. However, we can combine all the indices, without regard to the different monoalphabets, and obtain a much more precise value that is based on all the letters of the text.
Definition. The SINKOV statistic φl(a) of order l of a text a is the mean value of the coincidence indices over all l columns if the text is written in rows of width l.
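Written as a formula, the definition reads as follows. This is only a restatement under two notational assumptions that are not fixed in this section: the text is indexed from 0 as a = a_0 a_1 ... a_{r-1}, and φ without a subscript denotes the ordinary coincidence index of a string.

```latex
\varphi_l(a) \;=\; \frac{1}{l} \sum_{i=0}^{l-1}
  \varphi\bigl(a_i \, a_{i+l} \, a_{i+2l} \cdots\bigr)
```

Each summand is the coincidence index of one column, that is, of the letters whose positions are congruent to i modulo l.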
Here is a Perl program that calculates the SINKOV statistics.
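Independently of that Perl program, the definition can be illustrated by a minimal Python sketch. It is not the author's code. The function names `coincidence_index` and `sinkov_statistic` are hypothetical, the coincidence index is assumed to be the usual fraction of equal letter pairs at distinct positions, and a column with fewer than two letters is assumed to contribute 0, since the index is undefined there.

```python
from collections import Counter


def coincidence_index(text: str) -> float:
    """Fraction of pairs of distinct positions that carry the same letter."""
    r = len(text)
    if r < 2:
        # Assumption: the index is undefined for fewer than two letters;
        # this sketch lets such a column contribute 0.0.
        return 0.0
    counts = Counter(text)
    return sum(m * (m - 1) for m in counts.values()) / (r * (r - 1))


def sinkov_statistic(text: str, width: int) -> float:
    """Mean coincidence index of the columns for rows of the given width."""
    if width < 1:
        raise ValueError("width must be a positive integer")
    # Column i consists of the letters at positions i, i + width, i + 2*width, ...
    columns = [text[i::width] for i in range(width)]
    return sum(coincidence_index(col) for col in columns) / width
```

Calling `sinkov_statistic(ciphertext, l)` for l = 1, ..., 18 on a ciphertext string would produce a list of values of the kind shown in the table below. The input is assumed to contain only the letters of the alphabet, with spaces and punctuation already removed.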
Let us again examine the ciphertext from Section 3. We get the values:
φ1(a) | 0.0442 | φ7(a)  | 0.0829 | φ13(a) | 0.0444
φ2(a) | 0.0439 | φ8(a)  | 0.0443 | φ14(a) | 0.0839
φ3(a) | 0.0440 | φ9(a)  | 0.0427 | φ15(a) | 0.0432
φ4(a) | 0.0438 | φ10(a) | 0.0421 | φ16(a) | 0.0439
φ5(a) | 0.0430 | φ11(a) | 0.0426 | φ17(a) | 0.0444
φ6(a) | 0.0435 | φ12(a) | 0.0432 | φ18(a) | 0.0419
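The period 7 is overwhelmingly evident: the values at l = 7 and l = 14 (0.0829 and 0.0839) are nearly twice as large as all the others, which lie between 0.0419 and 0.0444. The values at orders that are not multiples of 7 agree almost perfectly with what is expected for a (German) ciphertext of period around 7.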
Our example ciphertext was quite long, so it is no surprise that the statistical methods perform very well. To get a more realistic picture, let us examine a ciphertext of length 148.