This is how we can use hashing to solve #Petr :
First how to hash all sub-strings of S of length n in O(n) :
Let the string be "abcdefgh"
We will do cumulative hashing as following :
Lets take a prime number p;
(here a , b , c , ... , h are ascii values of 'a' , 'b' ,'c' , .. ,'h' )
hash = a;
hash = a*p + b;
hash = a*p^2 + b*p + c;
hash = a*p^3 + b*p^2 + c*p + d;
like-wise up-to hash = a*p^7 + b*p^6 + ... + g*p + h;
So, you must have figured out the recursive relation for finding hash[x].
hash[x] = hash[x-1]*p + s[i].
(base case: hash = s )
Now as we are done with finding cumulative hash value. We need to find the hash value of any substring.
Lets see this thing:
Let there is a string "dabctabc" ,
now "abc" at 1 and "abc" at 5 are same strings , and we must wish to have same hash value for both of them . We propose that every hashed value should be of form :
(for a substring of s from i to j inclusive.)
n = length of the whole string .
s[i]*p^n + s[i+1]*p^(n-1) + .. + s[j] *p^(n-length_substring);
Can you feel/see that if we maintain this proposition above , "abc" at 1 and "abc" at 5 will have same hash values .
(make sure we understand this before moving further)
For maintaining above proposition , we maintain an array: step
step[x] = step[x-1]*p ;
(base case: step = 1)
Now from hash and step we can calculate hash values of each of the substrings.
Lets remind us we are talking about the string "dabctabc" :
For any substring starting at i and ending at j (inclusive) :
for "abc" at 1:
substring_length = 3.
so, hash_value should according to our proposition should be:
a*p^7 + b*p^6 + c*p^5 .
hash = d*p^3 + a*p^2 + b*p + c;
hash = d;
so our required value will be : (hash - hash*p^3)*(p^(8-3))
i.e (hash - hash*step[substring_length])*step[n - length_of_substring]
So, generalised formula for any substring starting at i and ending at j (inclusive) :
(hash[j] - hash[i]*step[substring_length])*step[n - substring_length] )
Now use the above formula for calculating hash value of "abc" at 5.
We will see that this "abc" also hash the same value that is "a*p^7 + b*p^6 + c*p^5".
Our job is almost done . :)