Блог пользователя oobxw9zf0lm7ez

Автор oobxw9zf0lm7ez, 3 года назад, По-английски

Hello community!

How can we find no. of occurences of a given substring t of string s in the string s? There are two parts of the problem :

  1. non-overlapping -> for eg. aa in aaa, Ans: 1

  2. overlapping -> Ans: 2

I know many good people here know this.

Thanks in advance.

  • Проголосовать: нравится
  • +1
  • Проголосовать: не нравится

»
3 года назад, # |
Rev. 3   Проголосовать: нравится +1 Проголосовать: не нравится

we can do sliding window of length of substring t on our string 's' , we will store the start index for each substring which is equal to 't' , total count of such strings is our overlapping ones and for non overlapping we will see if the difference between start index is >= length of t .

code

»
3 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Use KMP algorithm to find starting index of all occurences of given substring in original string.

Further, u can divide them in overlapping/non-overlapping based on difference between 2 consecutive starting points.

Overall Time Complexity : O(N)

Check out this for understanding KMP Algorithm:

  1. KMP
»
3 года назад, # |
  Проголосовать: нравится +8 Проголосовать: не нравится

A probably simpler solution using the z-function is as follows:

The z-function of the string $$$a[1..n]$$$ is defined as an array $$$z[1..n]$$$ such that $$$z[i]$$$ is the length of the maximum common prefix of the string $$$a[1..n]$$$ and $$$a[i..n]$$$. This can be constructed in $$$O(n)$$$ using an algorithm similar to what is described here.

Consider the string $$$t + \Phi + s$$$ where $$$ \Phi $$$ is a character that doesn't appear in either of $$$s$$$ or $$$t$$$. Note that the z-function of this array for characters after $$$\Phi$$$ will give the longest prefix of $$$t$$$ that matches a string starting from that character in $$$s$$$, so all we need to do is to count the number of indices after the index of $$$\Phi$$$ where the value of the z-function equals the length of $$$t$$$.