### mkagenius's blog

By mkagenius, 9 years ago,

This is how we can use hashing to solve #Petr :

First how to hash all sub-strings of S of length n in O(n)  :

Let the string be  "abcdefgh"

We will do cumulative hashing as following :
Lets take a prime number p;
(here a , b , c , ... , h are ascii values of 'a' , 'b' ,'c' , .. ,'h' )
hash[0] = a;
hash[1] = a*p + b;
hash[2] = a*p^2 + b*p + c;
hash[3] = a*p^3 + b*p^2 + c*p + d;
like-wise up-to hash[7] = a*p^7 + b*p^6 + ... + g*p + h;

So, you must have figured out the recursive relation for finding hash[x].
hash[x] = hash[x-1]*p + s[i].
(base case: hash[0] = s[0] )

Now as we are done with finding cumulative hash value. We need to find the hash value of any substring.
Lets see this thing:
Let there is a string "dabctabc" ,
now "abc" at 1 and "abc" at 5 are same strings , and we must wish to have same hash value for both of them . We propose that every hashed value should be of form :
(for a substring of s from i to j inclusive.)
n = length of the whole string .
s[i]*p^n + s[i+1]*p^(n-1) + .. + s[j] *p^(n-length_substring);

Can you feel/see that if we maintain this proposition above , "abc" at 1 and "abc" at 5 will have same hash values .
(make sure we understand this before moving further)

For maintaining above proposition  , we maintain an array: step[]
step[x] = step[x-1]*p ;
(base case: step[0] = 1)

Now from hash[] and step[] we can calculate hash values of each of the substrings.
Lets remind us we are talking about the string "dabctabc" :
For any substring starting at i and ending at j (inclusive) :
for "abc" at 1:
substring_length = 3.
so, hash_value should according to our proposition should be:
a*p^7 + b*p^6 + c*p^5 .
See,
hash[3]  = d*p^3 + a*p^2 + b*p + c;
hash[0]  = d;
so our required value will be : (hash[3] - hash[0]*p^3)*(p^(8-3))
i.e (hash[3] - hash[0]*step[substring_length])*step[n - length_of_substring]

So, generalised formula for any substring starting at i and ending at j (inclusive) :
(hash[j] - hash[i]*step[substring_length])*step[n - substring_length] )

Now use the above formula for calculating hash value of "abc" at 5.
We will see that this "abc" also hash the same value that is "a*p^7 + b*p^6 + c*p^5".

Our job is almost done . :)

• 0