tanmaydatta's blog

By tanmaydatta, history, 7 years ago, In English

I recently gave a hiring test and was stuck on a particular problem. The problem is as follows:

You are given an array A of N elements and Q queries of the form L,R.

For every query find the first missing positive integer in the range A[L],A[L+1],A[L+2] ...... A[R].

N <= 10^6

Q <= 10^6

1<= A[i] <= 10^9

I know how to find the first missing positive in O(N) but don't know how to handle multiple queries in less time. Any help would be appreciated. Thanks

  • Vote: I like it
  • +47
  • Vote: I do not like it

»
7 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

EDIT: misunderstood problem, ignore

  • »
    »
    7 years ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    "each node storing info about the leftmost non-positive integer in range" How would you combine the results? (combining results in query)

    Like left side has {3, 5} with first non-positive integer as 1. Right side has {1, 2} with first non-positive integer as 3

»
7 years ago, # |
  Vote: I like it 0 Vote: I do not like it

If I understood the problem correctly you can find the position of all the missing positive integers in o(n) and store them in a set then when you have a query just do a lower bound on that set.

Again I might have understood the problem incorrectly can you define what is the first missing intger ?

  • »
    »
    7 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    example if array is [1,3,4,5] then the smallest positive integer which is missing is 2

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Ohhh sorry I understood it wrong I thought you want the first integer that was outta place not the smallest

»
7 years ago, # |
  Vote: I like it +55 Vote: I do not like it

What I am going to explain is an offline O(Q * logN) solution.

Let's observe that the answer for a query will never be more than N + 1. This means that we can simply ignore numbers greater than N + 1 in the steps I am going to explain.

Now, let's loop from left to right (say we use i for the loop) and at any point of time, we store the last occurrence of each value last[value] (considering only values  ≤ N + 1). Let's take a look at some query [L;R] for which R = i. The answer will obviously be the smallest integer for which last[value] < L.

In order to find the smallest integer for which last[value] < L, we can build a segment tree, in which the indices correspond to value, the leaves store last[value] and the internal nodes store the minimum for their subtree. Here comes this fairly well-known binary search on segment tree. We start at the root. If the value stored in the left child of the current node is  < L, then we go to the left child. Otherwise, we visit the right child. We repeat this until we reach a leaf node and the index of this leaf node will be the answer for the query.

A really similar problem can be found here with the only difference that it asks for the smallest non-negative integer missing. I solved it recently, my code is here.

»
7 years ago, # |
  Vote: I like it 0 Vote: I do not like it

try Mo's algorithm, i know that this algo can help a lot to solve this problem and it's easy to code

»
7 years ago, # |
Rev. 9   Vote: I like it +12 Vote: I do not like it

I know that you found the answer, but you can see this. (O(QlogN) with segment tree)

»
7 years ago, # |
  Vote: I like it +5 Vote: I do not like it

@tanmay did you got the interview call

»
7 years ago, # |
Rev. 2   Vote: I like it -8 Vote: I do not like it

[DELETED]

»
7 years ago, # |
Rev. 3   Vote: I like it +2 Vote: I do not like it

I think persistence segment tree will work in O(log N) per query.. after compressing the A[i] values. You must compress the A[i] values such that no 2-non consecutive values gets the consecutive values in the compressed array.

EDIT: I don't think it will be able to handle the duplicates. So, this solution will not work. Although an offline solution with persistent tree should work similar to the one proposed by @X-tazy.

»
6 days ago, # |
  Vote: I like it -11 Vote: I do not like it

I guess you can find the first missing positive integer in a range in O(log(N)) time by using binary search. You would have something like log(Q * log(N)).

  • »
    »
    6 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    it's not binary searchable :|

»
5 days ago, # |
  Vote: I like it 0 Vote: I do not like it

we can use a segment tree to find it.