How to find the maximum number of non-overlapping occurances of a given substring.

Please subscribe to the official Codeforces channel in Telegram via the link https://t.me/codeforces_official. ×

→ Pay attention

Before contest
Codeforces Round (Div. 2)
45:16:42
Register now »

*has extra registration

→ Streams

Codeforces Round 960 Solution Discussion

By aryanc403

Before stream 47:26:41

View all →

→ Top rated

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	awoo	161
2	maomao90	160
3	adamant	157
4	maroonrk	154
5	-is-this-fft-	148
5	atcoder_official	148
5	SecondThread	148
8	Petr	147
9	TheScrasse	145
9	nor	145

View all →

→ Find user

→ Recent actions

Detailed →

chirag_h's blog

How to find the maximum number of non-overlapping occurances of a given substring.

By chirag_h, history, 3 years ago, In English

I was trying this problem Taklu Kuddus.

The problem gives us a string S and a pattern P and for Q queries we have to find the maximum number of non-overlapping occurrences of P in the substring of S of the given range in q.

I first tried brute force which got TLE. I tried playing with the starting and ending indexes of matched substrings but could not think of a clear strategy.

Any leads on how to approach this?

#string matching, #string

chirag_h
3 years ago
5

Comments (5)

Write comment?

Tet

3 years ago, # |

Your task is similar to Consistent occurrences [gym]. My solution for it is greedily picking a sub-string whenever possible...

→ Reply

brezhart

3 years ago, # |

Yes, solution is greedy, pick whenever you can pick. Proof:

Assume its not true, and you can get better score after skipping one occurrence. This mean that there is at least 2 non overlapping starts of occurrences inside of first occurrence. And that simply cant be true, because its mean that second occurrence has smaller length than first, but its false because its all occurrences of the same string

→ Reply

chirag_h

3 years ago, # |

← Rev. 3 →

Tet brezhart Thanks for replying.

As you guys suggested, I tried using greedy with rolling-hash but still got TLE. The constraints seem to be too high for an O(Q. N) algorithm. Any hint on how to improve this? Here is the code for reference: Code

→ Reply

jeroenodb

3 years ago, # |

← Rev. 2 →

When there's a working greedy strategy to a problem on arrays/string, (always pick the first occurrence), we can use binary jumping to answer queries.

Let jmp[i] = the index after the end of the first occurrence of P in S, such that P starts at index i or later.

With, for example rolling hashes, these jump pointers can be calculated from right to left in O(n) preprocessing. Then we calculate all jumps of length 2^k, for a total of O(n log(n)) preprocessing (like in binary lifting in trees). When we want to answer a query on substring [l,r]. We start at l, and do a binary search with the jump pointers to find the amount of jumps you can make to still be on the left of index r. This is again very similar to finding the LCA with binary lifting in a tree. So you can solve the problem in O((n+q) log(n)). If this TLEs, we could use Binary lifting in linear time preprocessing, to get O(n + qlog(n))

Edit: Because the statement says: It is guaranted that the summation of all the queries for each test case will not exceed 200000. After we have the jump pointers we don't need any binary jumping, and can just follow the pointers brute force to get O(n+ q + 200000).

→ Reply

chirag_h

3 years ago, # ^ |

+19

Thank you very much! I think I get what you're saying, Ill try implementing it!

→ Reply