LovesProgramming's blog

By LovesProgramming, history, 9 months ago, In English,

You are given a string of length 'n'.

If a string contains at least one character whose frequency is greater than or equal to the half of the length of the string, then the string is called superior.

You are required to find the length of the longest superior substring available in the given string.

Note: Here half is considered under integer division i.e. 5/2=2 , etc.

The string contains only lowercase English alphabets.

I have tried O(n^2) solution(s) with some optimization(s) which obviously got timed out. I realized that there are better:- O(nlogn) or O(n) approaches,can anybody explain them?

Link to the problem statement :- https://www.hackerearth.com/practice/algorithms/searching/binary-search/practice-problems/algorithm/superior-substring-dec-circuits-e51b3c27/

Thanks :-)

 
 
 
 
  • Vote: I like it
  • -6
  • Vote: I do not like it

»
9 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by LovesProgramming (previous revision, new revision, compare).

»
9 months ago, # |
  Vote: I like it -10 Vote: I do not like it

This should be solvable using a greedy solution using two pointer method. Take the largest substring from the left where there is a letter that makes it superior. As soon as this isn't true anymore, take another substring from the very next index and continue.

»
9 months ago, # |
Rev. 2   Vote: I like it +1 Vote: I do not like it

Deleted

  • »
    »
    9 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Can you explain a bit more on how to actually proceed? Thanks!:)

  • »
    »
    9 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Your solution is not correct. The function of the length of the superior substring is not monotonic. For example: aabcdeaa , in this string there is a superior substring of length 8 but not of length 6.

    • »
      »
      »
      9 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Whoops, thanks! Do you have any ideas?

      • »
        »
        »
        »
        9 months ago, # ^ |
        Rev. 2   Vote: I like it +11 Vote: I do not like it

        For every alphabet, you can find the length of the largest substring in which it contributes as a dominant character. That can be done by building a prefix sum over that alphabet and for every index P[i] find a j < i such that P[i] — P[j] > (j — i) / 2. For more details, you can refer to the editorial.

        • »
          »
          »
          »
          »
          9 months ago, # ^ |
            Vote: I like it -8 Vote: I do not like it

          Can't we perform binary search to check if we're able to achieve a superior string of given length, in O(n), for range low = 1, high = length of string.

          • »
            »
            »
            »
            »
            »
            9 months ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            No

            • »
              »
              »
              »
              »
              »
              »
              9 months ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              And why so??

              • »
                »
                »
                »
                »
                »
                »
                »
                9 months ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it

                See, in this problem, if a superior susbtring is of length '10', then it is not mandatory that a superior substring of length-'9'/'8'/etc. also exists so binary search cannot be applied and it will never yield correct answer :-)

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  9 months ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  Yeah, I got it. Thanks, btw, tag was given Binary search, so all of a sudden, mind went there.

  • »
    »
    9 months ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    Solution for "abcda" is 5 but no superior substring of length 4 exist.

»
9 months ago, # |
Rev. 4   Vote: I like it 0 Vote: I do not like it

If half was defined as actual half then there is a nice way to solve it. Remember there was a blog about something close to this a while back but now I can't seem to find it. Will use python notation to explain my idea.

Let be defined as

sum(1 for c in S[:i] if c=='a')

which is the number of 'a' in the string at indices  < i. For a substring S[a:b], a < b, to contain half or more 'a's then

.

which is the same as

.

This inequality basically solves everything, just go through all indices sorted by while keeping track of the smallest index you've seen and you're basically done. Here is some pseudo-code:

a = n
longest_substring = 0
for b in sorted(range(n+1),key=lambda i:(2*A[i]-i,i)):
  if b-a>longest_substring:
    longest_substring = b-a
  a = min(a,b)

with this you can also do other fun stuff like calculating number of longest superior substrings and so on. The running time should be alphabet size times . It is possible to remove the using bucket sort, and it might also be possible to remove the scaling with the alphabet size by some modifications to the algorithm.

  • »
    »
    9 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thankyou so much for the detailed solution. I recently got AC for this and my algorithm was O(n*(logn)^2) . I create 2 array(s) based on that above formula,then for each indice of array-'1' , I find the part of array in which I have to find max-distance/length. I used binary search as well as segment trees to resolve my query and hence O(n*(logn)^2) :-) Thankyou again!:)

»
9 months ago, # |
  Vote: I like it +26 Vote: I do not like it

Iterate over a letter that will be most frequent, and change the string into a sequence where every occurrence of the chosen letter is +1, and every other letter is -1. Then you're looking for the longest substring with a positive sum of values. This can be solved in O(n), so the whole problem is solvable in O(n·26).

  • »
    »
    9 months ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    Nice Idea, but a minor change is required This problem becomes a little different because of integer division definition. Lets say the string is: abc. Focusing on 'a' we get , 1 -1 -1 and the longest subarray with positive/0 sum is of length-2(0-1) but actually, the answer to this case is '3' cuz 3/2=1 which is the frequency of 'a' , so we have to find the longest subarray whose sum is >= -1 , thanks for the idea though :)

    • »
      »
      »
      9 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Ok, that's a strange rounding definition in the problem. I agree it should be >= -1 in this case.

      • »
        »
        »
        »
        9 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I did what you mentioned but it got timed-out on exactly 1 test out of all the tests : D, any ideas on further optimizations?

        And yea, on a side-note, today's stream was awesome, thanks for so much contribution ! :-)

        • »
          »
          »
          »
          »
          9 months ago, # ^ |
          Rev. 2   Vote: I like it 0 Vote: I do not like it

          Show us your code. Put it in "spoiler" tags or paste it to pastebin or ideone.

          And I'm glad you liked the stream :)

          • »
            »
            »
            »
            »
            »
            9 months ago, # ^ |
            Rev. 2   Vote: I like it 0 Vote: I do not like it

            ****Spoiler*****

            The solution which passes all the tests except one is here:- https://ideone.com/wvpXju

            Also, I created an O(nlogn) solution for finding the longest subarray with sum>=-1 . I feel maybe the time limits are too tight?

            Thanks :-)

            • »
              »
              »
              »
              »
              »
              »
              9 months ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              You can find the longest the longest substring with sum >= -1 in O(N).

              For this you can just keep a record of first time you had any prefix sum.
              Then largest substring ending at the index i with sum >= -1 would be
              i, if prefix[i] >= -1
              i - (minimum index having sum (prefix[i]+1)) + 1, if prefix[i] < -1

              • »
                »
                »
                »
                »
                »
                »
                »
                9 months ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it

                That does not happen in O(n) , especially, finding the minimum index that has that sum for every index of 'i' . Thanks, anyways :-)

              • »
                »
                »
                »
                »
                »
                »
                »
                9 months ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it

                If possible, can you give me the exact code for finding the longest sub-array with sum>=-1, so I can run it for all the tests? :-) Thanks. in. advance.

            • »
              »
              »
              »
              »
              »
              »
              9 months ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              As I said, that part can be solved in O(n). Values of prefix sums are from interval [ - n, n] only, so you can sort them in O(n).

              One approach is to compute x[value] as the shortest prefix with a sum equal to value. Then you must go through the sequence again and for every prefix you want to know: if this prefix sum is A, what is the shortest prefix with sum A - 1 or smaller? You can know this by computing prefix minima on the array x[].