oobxw9zf0lm7ez's blog

By oobxw9zf0lm7ez, 3 years ago, In English

Hello community!

How can we find no. of occurences of a given substring t of string s in the string s? There are two parts of the problem :

  1. non-overlapping -> for eg. aa in aaa, Ans: 1

  2. overlapping -> Ans: 2

I know many good people here know this.

Thanks in advance.

  • Vote: I like it
  • +1
  • Vote: I do not like it

| Write comment?
»
3 years ago, # |
Rev. 3   Vote: I like it +1 Vote: I do not like it

we can do sliding window of length of substring t on our string 's' , we will store the start index for each substring which is equal to 't' , total count of such strings is our overlapping ones and for non overlapping we will see if the difference between start index is >= length of t .

code

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Use KMP algorithm to find starting index of all occurences of given substring in original string.

Further, u can divide them in overlapping/non-overlapping based on difference between 2 consecutive starting points.

Overall Time Complexity : O(N)

Check out this for understanding KMP Algorithm:

  1. KMP
»
3 years ago, # |
  Vote: I like it +8 Vote: I do not like it

A probably simpler solution using the z-function is as follows:

The z-function of the string $$$a[1..n]$$$ is defined as an array $$$z[1..n]$$$ such that $$$z[i]$$$ is the length of the maximum common prefix of the string $$$a[1..n]$$$ and $$$a[i..n]$$$. This can be constructed in $$$O(n)$$$ using an algorithm similar to what is described here.

Consider the string $$$t + \Phi + s$$$ where $$$ \Phi $$$ is a character that doesn't appear in either of $$$s$$$ or $$$t$$$. Note that the z-function of this array for characters after $$$\Phi$$$ will give the longest prefix of $$$t$$$ that matches a string starting from that character in $$$s$$$, so all we need to do is to count the number of indices after the index of $$$\Phi$$$ where the value of the z-function equals the length of $$$t$$$.