Mikemirzyanov1's blog

By Mikemirzyanov1, history, 5 months ago, In English

Hi can someone share the approach to find lexicographically largest substring in a string?

I know the answer will always be one of the suffixes but cannot find a way to analyze those suffixes efficiently.

Constraints |S| < 1e5

I need an O(n) approach or O(nlogn) without suffix array.

Any ideas are welcome.

[Solution](https://leetcode.com/problems/last-substring-in-lexicographical-order/discuss/363662/Short-python-code-O(n)-time-and-O(1)-space-with-proof-and-visualization)

I found this link on Leetcode but cannot understand it.

 
 
 
 
  • Vote: I like it
  • +3
  • Vote: I do not like it

»
5 months ago, # |
  Vote: I like it -24 Vote: I do not like it

If a substring is the largest lexicographically, then it must start with the largest element from the string.

Observation: If you add elements to the end of a string it makes it larger lexicographically.

Therefore, the answer shold be the substring from the first occurence of the biggest character to the end of the string.

  • »
    »
    5 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    test case :- aazaaaazz

    your output ?

  • »
    »
    4 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Consider this case : I/p : cacacb

    According to you the o/p would be "cacacb", but the expected o/p is "cb".

  • »
    »
    4 weeks ago, # ^ |
    Rev. 4   Vote: I like it 0 Vote: I do not like it

    Forgive me if I say anything wrong.

    As pointed out,the above solution is indeed incorrect. cacacb would return cacacb based on that solution. The actual solution would be cccb. It's clear that between the c characters, there are other characters that are smaller than c (like 'a'). We need to get rid of that.

    I would maintain a deque. for each character char in the given string, we pop the last element of the deque out, until deque.back >= char, then push char to the back. So, the characters that are in the deque are as big as it can get and the string is as long as possible.

    Time and space complexities are both armortized O(n). Space complexity isn't the most optimized (as I can see an O(1) space complexity solution down there), but this one is easier to figure out and easier to code imho.

»
5 months ago, # |
  Vote: I like it 0 Vote: I do not like it

similar problem in one of atcoder round

»
5 months ago, # |
  Vote: I like it 0 Vote: I do not like it

I tried the problem.

I also read the explaination. Observation that optimum would be a suffix is good. So the optimum always ends at end of string, now he presents following method to get the start of optimum string.

The author is maintaining two pointers i & j and he maintains a invariant that :

optimum cannot start between i & j also it cannot start before i.

.........i.......j ______(unprocessed string) here he claims that optimum cannot start positions of dots (.)

He continues increasing j and i while maintaining above invariant. Now consider when j pointer reaches beyond end.

......i.............(end) j

According to invariant optimum cannot start from any positions of dot(.) so it must start from i. So, the answer is s[i:].

I hope it helps you understand the proof in the link.

For reference I attach my recursive solution of which I cannot reason the time complexity. In this solution I precalculate the number of consecutive same character starting from a position in array cons. My submission

»
4 weeks ago, # |
Rev. 3   Vote: I like it -8 Vote: I do not like it

This is my submission using maps and substring. It's not very efficient but works. Time O(nlogn) storage O(n). This uses maps and substrings. The idea is to get the substring which is the largest. We only consider substrings until the next occurrence of the largest character.

The case where the largest character is in succession is also handled by only considering the first of those indices since it going to always larger than the one immediately next to it.

Code

Although it's not efficient as the two pointer solution, but it is easy to implement and probably appropriate for someone having difficulty in the question

»
4 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

you can solve it by using hash + binary search, O(n*log(n)), because you need to find maximum of n strings(suffixes), so you can do n comparisons(by using binary search you can find maximum equal prefix, and then compare next letter)

»
4 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

One of the solutions, not the optimal and not easy one is to just create suffix array and take last value as position in suffix array