The Z-algorithm finds occurrences of a βwordβ W
within a main βtext stringβ T in linear time O(|W| + |T|).
Given a string S of length n, the algorithm produces
an array, Z where Z[i] represents the longest substring
starting from S[i] which is also a prefix of S. Finding
Z for the string obtained by concatenating the word, W
with a nonce character, say $ followed by the text, T,
helps with pattern matching, for if there is some index i
such that Z[i] equals the pattern length, then the pattern
must be present at that point.
While the Z array can be computed with two nested loops in O(|W| * |T|) time, the
following strategy shows how to obtain it in linear time, based
on the idea that as we iterate over the letters in the string
(index i from 1 to nβ-β1), we maintain an interval [L,βR]
which is the interval with maximum R such that 1ββ€βLββ€βiββ€βR
and S[L...R] is a prefix that is also a substring (if no such
interval exists, just let Lβ=βRβ=ββ-β1). For iβ=β1, we can
simply compute L and R by comparing S[0...] to S[1...].
Example of Z array
Index 0 1 2 3 4 5 6 7 8 9 10 11
Text a a b c a a b x a a a z
Z values X 1 0 0 3 1 0 0 2 2 1 0
Other examples
str = a a a a a a
Z[] = x 5 4 3 2 1
str = a a b a a c d
Z[] = x 1 0 2 1 0 0
str = a b a b a b a b
Z[] = x 0 6 0 4 0 2 0
Example of Z box

O(|W| + |T|)O(|W|)