SlavicG's blog

By SlavicG, history, 20 months ago, In English

Introduction

Hi everyone, because we noticed there was no good tutorial in English explaining this idea, KrowSavcik and I have decided to write a tutorial blog on it!

This idea finds the values of each element from a binary array using $$$O(N / log(N))$$$ sum queries, more specifically it solves problems of the type:

Given a binary array $$$b$$$ (an array consisting only of zeroes and ones) of $$$n$$$ elements, you can ask queries of the following format:

What is the sum of the subsequence of the values on positions: $$$p_1, p_2, ..., p_k$$$. In other words you will give the subsequence $$$p_1, p_2, ..., p_k$$$ to the interaction and it will return you the value of ($$$b_{p_1} + b_{p_2} + ... + b_{p_k}$$$).

And after asking $$$O(N / log(N))$$$ queries we should be able to tell the value of each index.

The technique was mentioned here too but it was not as detailed and not easy to find.

Concept

The idea is to use a divide and conquer-like approach. Because it's not an intuitive concept, firstly I will have to add some simple notations, so it will be easier to understand it.

  • $$$x_i$$$ — refers to the position $$$i$$$
  • $$$A$$$ — any capital letter (except S) refers to a set of points $$$x_i$$$
  • $$$v_{A} = \sum b_{x_i}$$$ for $$$x_i \in A$$$
  • $$$k$$$ — the layer we are currently considering
  • $$$S_i$$$ — Set of queries

Also you should keep in mind that indices are enumerated from 0.

Also keep in mind that even though it's said the queries are used, we actually only query at the end, and we build up our set of queries first for multiple layers first.

As it is a divide and conquer-like approach we will work with layers. Let's say that for layer $$$k$$$ we need $$$2^k$$$ queries and that from it we can get the value of $$$f_k$$$ elements. Then $$$f_{k+1} = 2 \cdot f_k + 2^k - 1$$$.

First of all, let's set our base case, which is $$$k = 0$$$. So, for $$$k = 0, f_0 = 1$$$ and the query set will only be {$$$0$$$}, so we know a single element using a single query.

The block $$$f_{k+1}$$$ is formed from $$$2$$$ blocks of size $$$f_k$$$ and $$$2^k -1$$$ additional elements in this order. Let's say $$$k_1$$$ represents the first block of $$$f_k$$$ elements and the $$$k_2$$$ the second such block.

The first query is used to get the sum on $$$[f_k, 2 \cdot f_k)$$$ — the sum of the second block. Then we add two new queries for each non last query in $$$S_{k_1}$$$ and $$$S_{k_2}$$$. First one is $$$S_{k_1}[i] \cup S_{k_2}[i]$$$. Second one is $$$S_{k_1}[i] \cup ([f_k, 2 \cdot f_k) / S_{k_2}[i]) \cup x_{(2 \cdot f_k + i)}$$$.

The last query is for entire range $$$[0, f_{k+1})$$$.

I think it's easy to see why we now have $$$2^{k+1}$$$ queries. The harder part is to understand why we don't lose any value in this process, and how we can solve it recursively. In fact, having answered all the $$$S_{k+1}$$$ queries, we can calculate all the $$$v_{S_{k_1}[i]}$$$ and $$$v_{S_{k_2}[i]}$$$.

Now, when we reach a $$$k$$$ with a value of $$$f_k >= n$$$ we can stop there. Let's assume $$$n = f_k$$$ since it will be easier to work with it (when $$$n$$$ is smaller than $$$f_k$$$ we can just think of it as appending $$$f_k - n$$$ zeroes at the end since they won't influence the sum at all). There is a more elegant way to form $$$n$$$ from blocks of different sizes, but it doesn't enter the scope of this blog and it's not that hard to implement using offsets. Using the set of queries responsible for the $$$k$$$-th layer we can in fact now reconstruct the whole sequence, now recursively going from the $$$k$$$-th layer to the ($$$k - 1$$$)-th one (but consider each layer can have multiple blocks).

First of all, the only things we care about when we are in the $$$k$$$-th layer are:

  • The query set for the corresponding block of the corresponding layer.
  • The block we are currently at (can be dealt with using an offset value in the recursion).

So we will store them when going recursively.

Firstly, let's again set our base case: $$$k = 0$$$. We are now sure that only one element is responsible for this block from this layer, so we can just set the value of the $$$b_x$$$-th bit (where $$$x$$$ is some offset value we use to keep track of the block) to $$$v_{S_0}$$$ (since that's the sum for a single element which we've seen at the build-up).

Now, since the base case is already dealt with, here's how we will go to the ($$$k - 1$$$)-th layer:

We will need to reconstruct the previous query sets for the first block and the second block of size $$$f_{k - 1}$$$, and set the values for the other $$$2^{(k - 1)} - 1$$$ values respectively (since they aren't part of any of the blocks they shouldn't be part of the recursion either).

Let's denote the numbers of $$$1$$$-s in $$$[f_k, 2 \cdot f_k)$$$ with $$$c$$$. It's obvious that $$$c = v_{S[0]}$$$. We will now go through every pair of queries, starting from 1. That means we will be analyzing queries $$$S_1[i] \cup S_2[i]$$$ and $$$S_1[i] \cup ([f_{k-1}, 2 \cdot f_{k-1}) / S_2[i]) \cup x_{(2 \cdot f_{k-1} + i)}$$$.

  • $$$v_{S[2 \cdot i + 1]} = v_{S_1[i]} + v_{S_2[i]}$$$
  • $$$v_{S[2 \cdot i + 2]} = v_{S_1[i]} + c - v_{S_2[i]} + b_{2 \cdot f_{k-1} + i}$$$

In this case we have to calculate 3 values: $$$v_{S_1[i]} , v_{S_2[i]} , b_{2 \cdot f_{k-1} + i}$$$.

  • $$$v_{S_1[i]} = \lfloor\frac{v_{S[2 \cdot i + 1]} + v_{S[2 \cdot i + 2]} - c}{2}\rfloor$$$
  • $$$v_{S_2[i]} = \lceil\frac{v_{S[2 \cdot i + 1]} - v_{S[2 \cdot i + 2]} + c}{2}\rceil$$$
  • $$$b_{2 \cdot f_{k-1} + i} = (v_{S[2 \cdot i + 1]} + v_{S[2 \cdot i + 2]} - c) \land 1$$$

The only remaining queries to answer are $$$v_{S_1[2^{k-1}]}$$$ and $$$v_{S_2[2^{k-1}]}$$$ as we didn't add them in $$$S$$$.

  • $$$v_{S_2[2^{k-1}]} = c = v_{S[0]}$$$
  • $$$v_{S_1[2^{k-1}]} = v{S[2^k]} - c - \sum_{i = 0}^{2^{k-1}-1} b_{2 \cdot f_{k-1} + i} $$$

So, with all this calculated we are ready to go down to the previous layer's blocks.

Visual representation

Here's a visual representation of the queries we use for going up to the next layer:

  • The green squares represent the queries responsible for the first block of the previous layer.
  • The orange squares represent the queries responsible for the second block of the previous layer.
  • The blue squares represent the queries that query the whole second block excluding the elements from the query of the second block.
  • The red squares represent the last $$$2^k - 1$$$ bits.

Count of queries (a bit optimized)

Here's a visual representation of how the count of queries grows as $$$n$$$ increases, we notice that the count of queries is around $$$2.67 \cdot n / log2(n)$$$.

Some Code

Keep in mind this code is pretty inefficient and is just used to explain the solution better (it's still functional though)

Firstly, let's set some helper functions to work easier with our query sets:

Helper Functions

Now, here's how we do the build-up part:

Build-up

And the final part — recursively going from the $$$k$$$-th layer to the ($$$k - 1$$$)-th till each element is visited.

The build-down
  • Vote: I like it
  • +168
  • Vote: I do not like it

| Write comment?
»
20 months ago, # |
  Vote: I like it +4 Vote: I do not like it

Thanks a lot for the post! I wish you silver medal at IOI and a lot of health, love, esteem and respect from your loved ones, the achievement of endless goals, beautiful achievements, to have only unforgettable moments and moments, and to get what you really want, because this is the only way to fulfillment, achievement and happiness!

»
20 months ago, # |
Rev. 2   Vote: I like it +1 Vote: I do not like it

Problems solvable with this trick:

»
20 months ago, # |
  Vote: I like it 0 Vote: I do not like it

This is why I upVoted SlavicG.

»
20 months ago, # |
Rev. 6   Vote: I like it 0 Vote: I do not like it

Resolved

  • »
    »
    20 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    These look fine on my end. Checked on Firefox,Chrome,Edge.

»
20 months ago, # |
  Vote: I like it -39 Vote: I do not like it

Recently lost the spot on top contributors and now a pathetically useless blog post? Coincidence? I think not.

Marinush

»
20 months ago, # |
  Vote: I like it +3 Vote: I do not like it

Great blog, thanks!

Some typos:

1) $$$b_{2⋅f_{k−1}+i}=(v_{S[2⋅i+1]}+v_{S[2⋅i+2]}−c)∧1$$$

It should be &, not ^.

2) "It's obvious that $$$c=S[0]$$$."

$$$c = v_{S[0]}$$$, not $$$S[0]$$$.

  • »
    »
    20 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thanks, fixed! As for point 1, $$$\land$$$ is correct in this case since it's the sign for logical and.

    • »
      »
      »
      20 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      ohhhh, sorry for being stupid, but it's a bit confusing :(

»
20 months ago, # |
  Vote: I like it +29 Vote: I do not like it

Such an amazing post, thanks! Now I know how to do IOI20 Mushrooms with 203 queries.