Hi can someone share the approach to find lexicographically largest substring in a string?

I know the answer will always be one of the suffixes but cannot find a way to analyze those suffixes efficiently.

Constraints |S| < 1e5

I need an O(n) approach or O(nlogn) without suffix array.

Any ideas are welcome.

I found this link on Leetcode but cannot understand it.

If a substring is the largest lexicographically, then it must start with the largest element from the string.

Observation: If you add elements to the end of a string it makes it larger lexicographically.

Therefore, the answer shold be the substring from the first occurence of the biggest character to the end of the string.

test case :- aazaaaazz

your output ?

Test case ;- zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz

What he is proposing is probably O(n*n) I think.

or zz only?

Consider this case : I/p : cacacb

According to you the o/p would be "cacacb", but the expected o/p is "cb".

Forgive me if I say anything wrong.

As pointed out,the above solution is indeed incorrect.

`cacacb`

would return`cacacb`

based on that solution. The actual solution would be`cccb`

. It's clear that between the`c`

characters, there are other characters that are smaller than`c`

(like 'a'). We need to get rid of that.I would maintain a deque. for each character

`char`

in the given string, we pop the last element of the deque out, until`deque.back >= char`

, then push`char`

to the back. So, the characters that are in the deque are as big as it can get and the string is as long as possible.Time and space complexities are both armortized O(n). Space complexity isn't the most optimized (as I can see an O(1) space complexity solution down there), but this one is easier to figure out and easier to code imho.

similar problem in one of atcoder round

Can you give a link?

I tried the problem.

I also read the explaination. Observation that optimum would be a suffix is good. So the optimum always ends at end of string, now he presents following method to get the start of optimum string.

The author is maintaining two pointers i & j and he maintains a invariant that :

`optimum cannot start between i & j also it cannot start before i.`

.........i.......j ______(unprocessed string) here he claims that optimum cannot start positions of dots (.)

He continues increasing j and i while maintaining above invariant. Now consider when j pointer reaches beyond end.

......i.............(end) j

According to invariant optimum cannot start from any positions of dot(.) so it must start from i. So, the answer is s[i:].

I hope it helps you understand the proof in the link.

For reference I attach my recursive solution of which I cannot reason the time complexity. In this solution I precalculate the number of consecutive same character starting from a position in array cons. My submission

Your solution is not visible

Here you go

This is my submission using maps and substring. It's not very efficient but works. Time O(nlogn) storage O(n). This uses maps and substrings. The idea is to get the substring which is the largest. We only consider substrings until the next occurrence of the largest character.

The case where the largest character is in succession is also handled by only considering the first of those indices since it going to always larger than the one immediately next to it.

Code

Although it's not efficient as the two pointer solution, but it is easy to implement and probably appropriate for someone having difficulty in the question

Two Pointer solution. This is more efficient. Time complexity O(n). Space complexity- O(1) normally, O(n) in the worst case

you can solve it by using hash + binary search, O(n*log(n)), because you need to find maximum of n strings(suffixes), so you can do n comparisons(by using binary search you can find maximum equal prefix, and then compare next letter)

One of the solutions, not the optimal and not easy one is to just create suffix array and take last value as position in suffix array