[Tutorial] A Bit of Math, Historic Sum, and Range Add Range Sum Binary Indexed Tree (Fenwick Tree)

Revision en4, by FHVirus, 2022-02-14 06:22:20

Hello CodeForces! Since my contribution is negative currently (-4) due to a stupid comment I made, I would like to write a little tutorial to earn some contribution stonks.

Anyways, here's the outline of this tutorial:

  1. We're going to solve a simple problem and a not-so-simple problem, then
  2. use the formula derived from the problems to solve "Historic Sum" problem and maintain it on data structures like segment trees.
  3. Use the same technique to solve the "Range Add Range Sum" problem with BIT. Yes, it's possible!

Disclaimer: I did not come up with the method myself. Tutorials on historic sum problem can be found here from jiry_2's blog, and range add range query BIT can also be found on the Internet. I forgot the link to the original post, but the same technique can be found in several different Chinese blogs.

This blog serves as an implementation of Feynman's learning technique, and I hope that it makes some concepts and reasoning behind the cool method much clearer for myself and anyone reading this blog.

It will be a bit long (I want to make everything clear for even a beginner), so maybe you'll want to do some binary jumping and skip ahead for certain parts of the explanation while reading.

Note:

  1. I'm using 1-based index on arrays and closed bounds for intervals.
  2. Section 2 and 3 are independent.
  3. The markdown and math blocks might have some problem, so here's a hackmd version.

1. The essence.

1.1 The simple problem

The problem looks like this:

Given an interger array $$$A$$$ of length $$$n$$$.
We define array $$$B$$$ where $$$B_i = \sum_{j = 1}^{i}{A_j}$$$.
Then we define another array $$$C$$$ where $$$C_i = \sum_{j = 1}^{i}{B_j}$$$.
Then there's $$$q$$$ queries, each of two types:
1. Print $$$C_n$$$.
2. Add $$$x$$$ to $$$A_k, k \in [1, n]$$$.

If there's only queries of type 1, two times of prefix sum will do the job. But what about modifications?

The key observation here is to play with the summations.

$$$ C_i = \sum_{j = 1}^{i}{B_j} = \sum_{j = 1}^{i}{\sum_{k = 1}^{j}{A_k}} \\ = \begin{matrix} (A_1) & + \\ (A_1 & + & A_2) & + \\ (A_1 & + & A_2 & + & A_3) & + \\ \vdots & & \vdots & & \vdots \\ (A_1 & + & A_2 & + & A_3 & + & \cdots & + & A_i) \end{matrix} $$$

When we switch our view from horizontal to vertical, we can se that there's no need to do two times of summation (prefix sum). With a bit of math, we can count the number of appearance of each $$$A_i$$$ in the equation (which is also the height of column $$$i$$$) and thus get a simpler formula:

$$$ C_i = \sum_{j = 1}^{i}{B_j} = \sum_{j = 1}^{i}{\sum_{k = 1}^{j}{A_k}} = \sum_{j = 1}^{i}{(i - j + 1) \cdot A_j} $$$

And for some reason, let's call the $$$(i - j + 1)$$$ term the "contribution multiplier" of $$$A_j$$$ to $$$C_i$$$.

There's two important property of this formula:

  1. It can be done in one pass for loop.
  2. Every element $$$A_i$$$ is independent. That is, for queries of type 2, a modification will be done on single point.

And the solution to the problem looks like this:

  1. Calculate $$$C_n$$$, and store the value sum in a single int or long long.
  2. sum += x * (n - k + 1);

Complexity: $$$O(n + q)$$$ time, $$$O(1)$$$ space.

1.2 The not-simple problem

The problem is pretty much the same, but for type 1 queries, we need to answer $$$C_i$$$ for arbitrary index $$$i$$$ instead of fixed index $$$n$$$. This means that the contribution multiplier is no longer fixed for one element in array $$$A$$$.

And the key is to play with summations again. But this time, since the difficulty stems from the index of queried $$$C_i$$$, let's try to make it independent from that of $$$A_j$$$:

$$$ C_i = \sum_{j = 1}^{i}{(i - j + 1) \cdot A_j} = (i + 1) \cdot \sum_{j = 1}^{i}{A_j} - \sum_{j = 1}^{i}{j \cdot A_j} $$$

Putting the modification aside, all we need to do is maintain the prefix sum $$$B_i = \sum_{j = 1}^{i}{A_j}$$$ and another prefix sum $$$B'_i = \sum_{j = 1}^{i}{j \cdot A_j}$$$, and then $$$C_i$$$ will be $$$(i + 1) \cdot B_i - B'_i$$$.

What about modifications? We simply maintain $$$B$$$ and $$$B'$$$ on two Binary Indexed Trees (or one, it's probably faster and better), and the modification becomes two point add operations.

Complexity: $$$O(n + q \log n)$$$ time, $$$O(n)$$$ space.

Note: It's possible to initialize a BIT in linear time. Read this blog for more information.

The code is very simple, so I would like to be lazy here and omit it.


2. Historic sums & maintaining it on data structures.

The problem goes like this:

Given two identical arrays $$$D$$$ and $$$E$$$ of length $$$n$$$. There are two types of queries:
1. Add $$$x$$$ to $$$D[l, r]$$$. After that, add $$$D_i$$$ to $$$E_i$$$ for all $$$i \in [1, n]$$$.
2. Print the sum of $$$E[l, r]$$$.

It's a bit hard, so let's solve an easier version where $$$n = 1$$$ first. Hint: It's almost the same as the not-so-simple problem.

Solution

There's another inportant property here: if we modified $$$A_p$$$, we will never ask for $$$C_q$$$ where $$$p > q$$$. So all the information we need can be maintained by a single long long.

And now moving on to the original problem with bigger $$$n$$$. It's obvious that we just need to use a segment tree to maintain $$$B_i$$$ and $$$B'_i$$$ for every cell in $$$E$$$, and the modifications will be two ordinary range add queries.

Another method of solving this problem is to maintain a tag of "how many modifications have passed after last pushed down the addition tag." It's more complicated but can also be used in more situations. I'm not going to explain it here because it's irrelevent the main idea of this blog, and if you want to learn about that, please read smax's blog.

For more sophisticated problems about historic sum, refer to jiry_2's blog.


3. Range Add Range Sum BIT

First, let's convert the problem into suffix add prefix sum.

Given an array $$$F$$$ of length $$$n$$$:
1. Add(p, x): Add $$$x$$$ to F_i for all $$$i \in [p, n]$$$.
2. Sum(p): Print $$$\sum_{i = 1}^{p}{F_i}$$$.

Define an array $$$G$$$ where $$$G_i = \sum_{j = 1}^{i}{F_j}$$$. For each query of type 1, it will contribute $$$(i - p + 1) \cdot x$$$ to $$$G_i$$$ for every $$$i >= p$$$. Doesn't it seem similar to the not-so-simple problem?

And voilà! We can do range add range sum with BIT! There's not much to explain, so here's the code.

Code

And that's it for this tutorial!

Special thanks to 8e7 for proofreading.

Tags binary indexed tree, math, data structures, 8e7orz

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en5 English FHVirus 2022-02-14 06:22:50 0 (published)
en4 English FHVirus 2022-02-14 06:22:20 142 (saved to drafts)
en3 English FHVirus 2022-02-13 13:04:37 227 (published)
en2 English FHVirus 2022-02-13 12:56:09 681
en1 English FHVirus 2022-02-13 12:47:00 8268 Initial revision (saved to drafts)