viralm's blog

By viralm, history, 7 years ago, In English

Often when trying to solve a problem involving Binary Search or 2-Pointer Technique, I make off by one errors, and/or fail to handle edge cases. Even if I know that my solution might fail on a particular edge case, to correct it takes a lot of time. I would like to know a method/implementation, such that I can code up the solutions without having to worry about edge cases, etc. For Binary Search, I know of one such method where to avoid infinite loop, we can use the following code: Say we want to find the maximum index in an array which satisfies certain property -->

while(hi-lo>1)
{
    int mid=(lo+hi)/2;
    int chk=check(mid);
    if(chk==1) lo=mid;
    else hi=mid-1;
}
int ans;
if(check(hi)==1) ans=hi;
else ans=lo;

If any better approach is available(for Binary Search), you are welcome to comment. Also, can you give a good implementation for 2 — Pointer Technique.

Thanks in advance.

  • Vote: I like it
  • +4
  • Vote: I do not like it

»
7 years ago, # |
  Vote: I like it +4 Vote: I do not like it

Suppose we want to find the maximum index at which function returns a true value.

while(lo<=hi)
{
   mid=(lo+hi)/2;
   if(f(mid)) ans=mid, lo=mid+1;
   else hi=mid-1;
}
print ans
  • »
    »
    7 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Your code gets into an infinite loop when lo==hi.

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it +4 Vote: I do not like it

      It won't. Depending on what f(mid) returns either lo will be incremented or hi will be decremented which will make lo<=hi false.

»
7 years ago, # |
  Vote: I like it +3 Vote: I do not like it
»
7 years ago, # |
Rev. 2   Vote: I like it +23 Vote: I do not like it

I'd like to copy to my a part of my Quora answer almost verbatim here:

The most common mistake in implementing binary search is trying to remember or guess off-by-ones, correct termination conditions and pre-checks instead of understanding invariant of the algorithm. Some peoplestick to a specific implementation of binary search. But once you have an invariant and follow it, you won't make a mistake and will be able to write and understand any kind of binary search. Moreover, you will be able to find bugs easily.

Example: I want to find first element  ≥ X in a sorted array. Invariant: I have a interval from L to R such that aL < X and aR ≥ X. Then, after I check element M in between, I set either L or R to M, preserving the invariant. Loop is terminated when answer is obvious — that is, there are no elements strictly between L and R, and answer is R. Oh, and initialization is easy as well: I can assume that my array has  - ∞ before the beginning and  + ∞ after the end. As I will never read these elements, I just initialize L =  - 1 and R = l, where l is length of the array.

Another example: I want to find element X. Invariant: a[L]<X and a[R]>X. Initialization: same. If aM is X, I return answer, otherwise I change L or R. If there are no elements between L and R, there is no answer.

So, my recommendation: basically, understand what exactly your binary search is required to return and what whether there are any "corner" cases or cases where answer is non-existent. If there are, you're walking on the edge and have to be extra careful with invariants.

Typically, you binary search problem should be formulated in the following form: there is a predicate f(x) such that it does not hold for some prefix of the array (possibly empty) and it holds for the remaining suffix, binary search is to find that border. If it's your case, then invariant is simple: f(aL) = 0, f(aR) = 1. Say, C++'s lower_bound and upper_bound's invariants are formulated in that way: lower_bound returns first element which is  ≥ X (assuming that there is  + ∞ after array's end), upper_bound return first element which is  > X (under the same assumption). No corner cases.

  • »
    »
    7 years ago, # ^ |
    Rev. 2   Vote: I like it +8 Vote: I do not like it

    BTW, after you formulated predicate, you may often use std::partition_point