Cryptology

The Phi Distribution for English Texts

To determine empirically the distribution of the inner coincidence index φ(a) for English texts (or text chunks) a, we again take a large English text (in this case the book The Fighting Chance by Robert W. Chambers from Project Gutenberg) and chop it into chunks a, b, c, d, ... of 100 letters each. Then we compute φ(a), φ(b), ... and list the values in the first column of a spreadsheet. The text has 602536 letters. Here is the cleaned text.

We take the first 262006 letters and consider the first 2000 pieces of 100 letters each. The figure and table show some characteristics of the distribution.

To obtain φ-values, divide the x-values in the figure by 4950, the number of letter pairs in a chunk of 100 letters (100·99/2 = 4950).
[Frequency of phi values]

Distribution of φ for 2000 English texts of 100 letters

Minimum:       0.0481
Median:        0.0634    Mean value:     0.0639
Maximum:       0.0913    Standard dev:   0.0063
1st quartile:  0.0594    5% quantile:    0.0549
3rd quartile:  0.0677    95% quantile:   0.0750
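Characteristics of this kind could be computed from a list of φ values along the following lines. This is only a sketch: the page does not say which quantile convention or which standard deviation (sample or population) the spreadsheet used, so the linear-interpolation quantile and the sample standard deviation below are assumptions, and the names quantile and summarize are hypothetical.

```python
import statistics


def quantile(sorted_vals: list[float], p: float) -> float:
    """Quantile by linear interpolation between neighbours (assumed convention)."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def summarize(values: list[float]) -> dict[str, float]:
    """Summary characteristics of a list of phi values."""
    s = sorted(values)
    return {
        "minimum": s[0],
        "median": statistics.median(s),
        "maximum": s[-1],
        "q1": quantile(s, 0.25),
        "q3": quantile(s, 0.75),
        "mean": statistics.mean(s),
        "stdev": statistics.stdev(s),  # sample standard deviation (assumption)
        "p05": quantile(s, 0.05),
        "p95": quantile(s, 0.95),
    }
```

Different quantile conventions can shift the quartiles slightly, so small deviations from the tabulated values would not by themselves indicate an error.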

Author: Klaus Pommerening, 2013-Dec-20; last change: 2014-Jan-23.