ramentaneja's blog

By ramentaneja, history, 9 months ago, In English

Hello everyone,

I have been using Binary Search for a while, but it still confuses me as to when I should use <= and when I should use < and how I should be updating my low and high variables accordingly.

I am not sure if this has already been answered, I was unable to find an answer that solved my doubt regarding how to update low and high variables.

It's just something I am struggling to wrap my head around, any help regarding this would be highly appreciated. Cheers!

  • Vote: I like it
  • +10
  • Vote: I do not like it

»
9 months ago, # |
Rev. 2   Vote: I like it +17 Vote: I do not like it

I'm writing binary search in this way: if you want to find the lowest $$$x$$$ satisfying $$$f(x)$$$:

int l=lowest_possible_value-1,r=highest_possible_value,m;
while(r-l>1)
{
    m=l+(r-l)/2;if(f(m))r=m;else l=m;
}
return r;

and vice versa if you want to find the highest $$$x$$$ satisfying $$$g(x)$$$:

int l=lowest_possible_value,r=highest_possible_value+1,m;
while(r-l>1)
{
    m=l+(r-l)/2;if(g(m))l=m;else r=m;
}
return l;
  • »
    »
    9 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Does this always work? or do you modify according to problems as well?

    • »
      »
      »
      9 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Works almost always (l = m , r = m). You just have to find the best initial values for l and r for this.

      • »
        »
        »
        »
        9 months ago, # ^ |
        Rev. 2   Vote: I like it 0 Vote: I do not like it

        The key idea here is keeping invariants, this is precisely what's happening in the code. In the case of finding the smallest $$$x$$$ satisfying $$$f(x)$$$ we initially choose $$$l$$$ and $$$r$$$ such that $$$f(l)$$$ does not hold and $$$f(r)$$$ holds. Throughout the whole binary search, we always keep $$$l$$$ and $$$r$$$ that way. In the case of maximizing $$$x$$$, we keep the invariant that $$$f(l)$$$ holds and $$$f(r)$$$ does not hold (or in the code, $$$g(l)$$$ and $$$g(r)$$$).

        • »
          »
          »
          »
          »
          9 months ago, # ^ |
          Rev. 2   Vote: I like it 0 Vote: I do not like it

          The way I think of it is this: at any point, we have $$$l, r$$$ such that $$$f$$$ is true on $$$(l_0, l]$$$ and false on $$$[r, r_0)$$$, and we want to bring one of these "borders" towards the other. With this, the base case $$$(l, r) = (l_0, r_0)$$$ is trivially true. The termination condition says $$$f(l_0, x]$$$ is true and $$$f[x + 1, r_0)$$$ is false, so either $$$x = l_0$$$ or $$$x$$$ is such that the non-empty subarray $$$[l_0 + 1, x]$$$ is true, and either $$$x + 1 = r_0$$$ or $$$x$$$ is such that the non-empty subarray $$$[x + 1, r_0 - 1]$$$ is true. Thus we can simply set $$$l_0$$$ and $$$r_0$$$ to be one before and one after the search array endpoints (or the places where we definitely know that $$$f$$$ is true and false respectively).

    • »
      »
      »
      9 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      To be honest, I haven't seen a binary search problem where the codes don't work. Tell me if you find any!

»
9 months ago, # |
Rev. 2   Vote: I like it -14 Vote: I do not like it

.

»
9 months ago, # |
  Vote: I like it +1 Vote: I do not like it

Try to formally declare your state, what $$$l$$$ and $$$r$$$ means.

Let's declare some notation:

$$$a[i...j]$$$ = Subarray $$$(a[i], a[i + 1], a[i + 2], ..., a[j])$$$

$$$a[i...]$$$ = Subarray $$$(a[i], a[i + 1], a[i + 2], ...,$$$ till end of array $$$)$$$

$$$a[...j]$$$ = Subarray $$$($$$ From start of array $$$..., a[j - 1], a[j])$$$

$$$a[i...i]$$$ = Subarray having single term $$$(a[i])$$$

Empty Subarray: $$$a[0...-1]$$$ or $$$a[1...0]$$$ or ... (Basically when $$$i > j$$$, it denotes an empty subarray)

Now let's find the logic for the lower bound of an array sorted in increasing order.

Function $$$lowerBound(a, x)$$$: Returns the smallest index having element greater than or equal to $$$x$$$.

The function will look something like this:

int lowerBound(int[] a, int x) {
	int n = a.length;
	int l = _;
	int r = _;

	while(_) {
		int m = (l + r) / 2; // Or maybe ceil(l + r) / 2
		if(a[m] >= x) {
			l = _;
			// OR
			r = _;
		} else {
			l = _;
			// OR
			r = _;
		}
	}

	return _;
}

Let $$$l$$$ and $$$r$$$ be defined as:

$$$a[...(l - 1)]$$$: All elements are smaller than $$$x$$$

$$$a[l...(r - 1)]$$$: Unchecked portion of array

$$$a[r...]$$$: All elements are greater than or equal to $$$x$$$

First, think about the initial value of $$$l$$$ and $$$r$$$.

Initialization

Now, let's see how we should update the values $$$l$$$ and $$$r$$$ at every iteration.

Maintenance

How long the loop should run?

Termination

If at any point, I ask you the current best answer you have, what would it be?

Answer Index

The final code will be:

int lowerBound(int[] a, int x) {
	int n = a.length;
	int l = 0;
	int r = n;

	while (l < r) {
		int m = (l + r) / 2;
		if (a[m] >= x) {
			r = m;
		} else {
			l = m + 1;
		}
	}

	return r;
}

This is one of the possible codes on lowerBound. You can have different codes on considering different states.

The same idea can be applied on any binary search problem.

When solving problems using "Binary search on answer", you'll get one of the two situations:

...TTTTTFFFFF...

OR

...FFFFFTTTTT...

You can make a binary search algorithm for the last true or first false in the first case, or the last false or first true in the second case.

»
9 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Binary search is a search algorithm. As long as you don't lose whatever you're searching for, then you win.

In binary search, the search domain (i.e. the answer you're looking for) is in the [low, high] region. You check the answer for mid = (low + high)/2. Is mid a valid answer? Would you want to discard mid? If yes for the first question and no for the second, don't lose it. Otherwise, feel free to drop it.

Key point is, again, don't lose sight of what you're "searching" for. Only discard what's guaranteed to not be a possible answer, and don't ever let the answer escape the [low, high] zone.

»
9 months ago, # |
  Vote: I like it 0 Vote: I do not like it

I prefer checking low <= high. If the value of high is initially good then after doing binary search return low, otherwise return high

»
9 months ago, # |
  Vote: I like it +9 Vote: I do not like it

The diffrence between <= and < is when using <=, the binary search range is $$$[L,R]$$$, and when using <, the range is $$$[L,R)$$$.

I perfer using <= and use another variable res to store the possible answer.

int L, R, res;
while(L <= R) {
    int mid = (L + R) / 2;
    if(check(mid)) res = mid, L = mid + 1;
    else R = mid - 1;
}
»
9 months ago, # |
  Vote: I like it 0 Vote: I do not like it

I wrote a blog a while ago which has a few sections that talk about the ways people implement binary search, in particular, about implementations that use $$$<$$$ and implementations that use $$$\le$$$.