Cryptology

Pattern Search with Perl

Introduction: Numerical Patterns of Strings

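We want to describe letter patterns by numbers: the distinct letters of a word are numbered consecutively in order of first appearance, and each letter is replaced by its number. For example, the word statistics defines the pattern 1232412451.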

For the mathematical background of the following algorithm see here.


Algorithm

Goal: Determine the numerical pattern of a string.

Input: String as list string = (a1,...,aq).

Output: Numerical pattern as list pattern = (n1,...,nq).
   Initial value: empty list.

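Auxiliary variables:
   n = number of distinct letters found so far. Initial value: 0.
   assoc = list of the distinct letters found so far, in order of first occurrence. Initial value: empty list.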

Procedure:
   Loop over the letters of string; current letter is x:
      If there is an i with x = assoc[i]
         then append i to pattern;
      else:
         increment n,
         append n to pattern;
         append x to assoc.

Here is a Perl program that implements this algorithm. It serves only as an aid to understanding the more complex program that follows.
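
The linked program is not reproduced in this text. As a stand-in, a minimal sketch of the algorithm might look as follows. It is not the author's code: the names numerical_pattern, %number and @pattern are chosen here for illustration, and a hash takes the place of the list assoc so that looking up a letter needs no inner loop.

```perl
use strict;
use warnings;

# Return the numerical pattern of a string as a list of numbers.
# Illustrative sketch; sub and variable names are assumptions.
sub numerical_pattern {
    my ($string) = @_;
    my %number;    # plays the role of assoc: letter -> number assigned to it
    my $n = 0;     # number of distinct letters found so far
    my @pattern;   # output list, initially empty
    for my $x (split //, $string) {
        # A letter not seen before gets the next unused number.
        $number{$x} = ++$n unless exists $number{$x};
        push @pattern, $number{$x};
    }
    return @pattern;
}

print join('', numerical_pattern('statistics')), "\n";   # prints 1232412451
```

Joining the numbers without a separator is unambiguous only for words with at most nine distinct letters; for longer alphabets a separator such as a comma is the safer choice.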


Building a Regular Search Expression for a Pattern

The most efficient approach to pattern searching with Perl uses so-called regular expressions. Here is the minimum explanation needed for the following program:

Search expression                 Effect

/./                               matches the first character
                                  (if the string is not empty).
/(.)/                             matches the first character
                                  and assigns it to the variable $1.

/../                              matches the first two characters.
/(..)/                            matches the first two characters and assigns
                                  them as a string of length 2 to the variable $1.
/(.)(.)/                          matches the first two characters and assigns
                                  them as strings of length 1 to the variables
                                  $1 and $2.

/(.)\1/                           matches an arbitrary character immediately
                                  followed by the same character.
                                  [The first character is assigned to the
                                  variable $1; inside the search expression
                                  it is referred to as \1.]
/(.)(?!\1)(.)/                    matches two adjacent characters if the second
                                  differs from the first, and assigns them as
                                  strings of length 1 to the variables $1 and $2.
                                  [(?!X) is a negative lookahead: it succeeds
                                  only if X does not match at the current
                                  position, and it consumes no characters.]

/(.)(.)\1(.)\1/                   pattern 12131, if different numbers need not
                                  correspond to different characters.
/(.)(?!\1)(.)\1(?!\1|\2)(.)\1/    pattern 12131, if different numbers must
                                  denote different characters.
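
To see the difference between the two expressions for pattern 12131, the following small sketch applies both to a few sample strings. The anchors ^ and $ are an addition made here so that the whole string must fit the pattern; without them an expression may match anywhere inside a longer string. The helper name matches and the sample strings are illustrative assumptions.

```perl
use strict;
use warnings;

# Pattern 12131, anchored to the whole string.
my $loose  = qr/^(.)(.)\1(.)\1$/;                  # numbers may share a character
my $strict = qr/^(.)(?!\1)(.)\1(?!\1|\2)(.)\1$/;   # numbers denote distinct characters

# Return 'yes' or 'no' depending on whether $word matches $re.
sub matches {
    my ($word, $re) = @_;
    return $word =~ $re ? 'yes' : 'no';
}

for my $word (qw(rarer abaca ababa aaaaa radar)) {
    printf "%-6s loose: %-3s  strict: %s\n",
        $word, matches($word, $loose), matches($word, $strict);
}
```

Of these strings only rarer and abaca pass the strict test; ababa and aaaaa pass only the loose one, and radar (pattern 12321) fails both.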


Pattern Search Program

The complete program, written in the highly condensed style typical of Perl, is here.

You may call it online from a web form and search in a dictionary or a text of your choice.
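
The linked program is not reproduced in this text. As a rough, deliberately uncondensed sketch of the underlying idea (not the author's code), the following builds the strict search expression from a sample word and filters a word list with it. The names pattern_regex and backref and the word list are hypothetical. The sketch writes backreferences as \g{N}, which Perl 5.10 and later accept and which stays unambiguous for N of 10 or more.

```perl
use strict;
use warnings;

# Text of a backreference to capture group number $_[0].
sub backref { return '\g{' . $_[0] . '}' }

# Build an anchored expression that matches exactly the strings having
# the same numerical pattern as $sample (distinct numbers, distinct letters).
sub pattern_regex {
    my ($sample) = @_;
    my %group;     # letter -> number of the capture group holding it
    my $n  = 0;    # number of capture groups opened so far
    my $re = '';
    for my $x (split //, $sample) {
        if (exists $group{$x}) {
            # Repeated letter: refer back to its group.
            $re .= backref($group{$x});
        } else {
            # New letter: it must differ from every earlier group.
            $re .= '(?!' . join('|', map { backref($_) } 1 .. $n) . ')' if $n;
            $re .= '(.)';
            $group{$x} = ++$n;
        }
    }
    return qr/^$re$/;
}

my $re    = pattern_regex('rarer');    # numerical pattern 12131
my @words = qw(rarer ababa abaca stats aaaaa going);
my @hits  = grep { $_ =~ $re } @words;
print "@hits\n";                       # prints: rarer abaca
```

For the sample word rarer the generated expression corresponds to the last row of the table above, with \g{1} and \g{2} in place of \1 and \2.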

Exercise. Try to understand the complete program.

Exercise. Try out the program.


Author: Klaus Pommerening, 1999-Oct-29; last change: 2013-Oct-11.