[Tutorial] Path sum queries on a tree using a Fenwick tree

#	User	Rating
1	jiangly	3640
2	Benq	3593
3	tourist	3572
4	orzdevinwang	3561
5	cnnfls_csy	3539
6	ecnerwala	3534
7	Radewoosh	3532
8	gyh20	3447
9	Rebelz	3409
10	Geothermal	3408

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	163
4	TheScrasse	159
5	nor	158
6	maroonrk	156
7	-is-this-fft-	151
8	SecondThread	147
9	orz	146
10	pajenegod	145

Introduction

I've come across several problems that require you to be able to efficiently find the sum of some values on a path from $$$A$$$ to $$$B$$$ in a tree. One can solve these problems using tree decomposition techniques like heavy-light decomposition or centroid decomposition, but that's a bit of an overkill and not very elegant in my opinion.

In this blog post, I will describe a way to solve these types of problems using just a DFS, a Fenwick tree, and LCA (as well as a way to solve JOIOC 2013 Synchronization this way).

Note: Whenever I talk about edge $$$(u, v)$$$ in this post, I assume that $$$u$$$ is the parent of $$$v$$$.

Prerequisites

DFS
Preorder traversal (DFS order)
Fenwick tree
Binary lifting and LCA

The idea

Say you have the following problem:

Given a tree with $$$N$$$ nodes, each node $$$i$$$ has a value $$$a_i$$$ that is initially 0. Support the following 2 types of operations in $$$O(\log N)$$$ time:

Increase the value of $$$a_i$$$ by $$$X$$$

Find the sum of $$$a_i$$$ on the path from $$$u$$$ to $$$v$$$ for 2 nodes $$$u$$$ and $$$v$$$

First, we flatten the tree using a preorder traversal. Let the time we enter node $$$i$$$ be $$$tin_i$$$ and the time we exit it be $$$tout_i$$$. Additionally, let $$$b$$$ be an array/Fenwick tree of size $$$2N$$$.

If you're familiar with LCA, you'll know that node $$$u$$$ is an ancestor of node $$$v$$$ if and only if $$$tin_u \leq tin_v$$$ and $$$tout_u \geq tout_v$$$.

If you're familiar with range updates and point queries with Fenwick trees, you'll know that if we want to increase the range $$$[A, B]$$$ by $$$X$$$ in an array/Fenwick tree $$$BIT$$$, then we increase $$$BIT[A]$$$ by $$$X$$$ and decrease $$$BIT[B + 1]$$$ by $$$X$$$. Then when we want to find the value of the element at $$$C$$$, we simply query the sum of $$$BIT[i]$$$ for each $$$i$$$ from 0 to $$$C$$$. This works because if $$$C$$$ isn't inside the range of an update, the 2 values we update "cancel out" in the query.

We can combine the 2 ideas above to work on trees — if we want to increase the value of node $$$u$$$ by $$$X$$$, we increase $$$b[tin_u]$$$ by $$$X$$$ and decrease $$$b[tout_u + 1]$$$ by $$$X$$$. Then when we want to find the sum of nodes on the path between node $$$v$$$ and the root, we simply query the sum of $$$b[i]$$$ for each $$$i$$$ from 0 to $$$tin_v$$$. This works because node $$$w$$$ only contributes to the sum if $$$w$$$ is an ancestor of $$$v$$$.

We can then use LCA and the principle of inclusion-exclusion to find the sum of nodes/edges on the path between nodes $$$u$$$ and $$$v$$$.

This idea also works if we want to sum edges instead of nodes — when we update edge $$$(u, v)$$$, update $$$b[tin_v]$$$ and $$$b[tout_v + 1]$$$ and make queries similarly.

Problem 1 — Infoarena Disconnect

Here's the gist of the problem:

You're given a tree with $$$N$$$ nodes. Process $$$M$$$ of the following queries online:

Delete the edge $$$(u, v)$$$ from the path.
Determine whether there is a path from $$$u$$$ to $$$v$$$.

$$$N \leq 10^5$$$ and $$$M \leq 5 \cdot 10^5$$$

Solution

Let the value of each edge in the tree initially be 0. When we "delete" an edge, we just update its value to be 1.

Notice how there is a path from $$$u$$$ to $$$v$$$ if and only if the sum of edges on the path between $$$u$$$ and $$$v$$$ is 0.

We can then apply the above idea and solve this problem in $$$O(M \log N)$$$ time.

Alternatively,

Find the lowest ancestor $$$l_u$$$ of $$$u$$$ such that the sum of the edges between $$$u$$$ and $$$l_u$$$ is 0
We can do this using binary lifting and our trick
Find $$$l_v$$$ similarly for $$$v$$$.
Check whether $$$l_u$$$ is an ancestor of $$$v$$$ and whether $$$l_v$$$ is an ancestor of $$$u$$$.

This solution runs in $$$O(M \log^2 N)$$$ time.

My code for this problem

#include <bits/stdc++.h>
#define FOR(i, x, y) for (int i = x; i < y; i++)
typedef long long ll;
using namespace std;

int n, m;

vector<int> graph[100001];
int timer = 1, tin[100001], tout[100001];
int anc[100001][18];

void dfs(int node = 1, int parent = 0) {
    anc[node][0] = parent;
    for (int i = 1; i < 18 && anc[node][i - 1]; i++)
        anc[node][i] = anc[anc[node][i - 1]][i - 1];

    tin[node] = timer++;
    for (int i : graph[node]) if (i != parent) dfs(i, node);
    tout[node] = timer;
}

int bit[100001];

void update(int pos, int val) { for (; pos <= n; pos += (pos & (-pos))) bit[pos] += val; }

int query(int pos) {
    int ans = 0;
    for (; pos; pos -= (pos & (-pos))) ans += bit[pos];
    return ans;
}

int find_ancestor(int node) {
    int lca = node;
    for (int i = 17; ~i; i--) {
        if (anc[lca][i] && query(tin[anc[lca][i]]) == query(tin[node])) lca = anc[lca][i];
    }
    return lca;
}

int main() {
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    freopen("disconnect.in", "r", stdin);
    freopen("disconnect.out", "w", stdout);
    cin >> n >> m;
    FOR(i, 1, n) {
        int a, b;
        cin >> a >> b;
        graph[a].push_back(b);
        graph[b].push_back(a);
    }
    dfs();

    int v = 0;
    while (m--) {
        int t, x, y;
        cin >> t >> x >> y;
        int a = x ^ v, b = y ^ v;

        if (t == 1) {
            if (anc[b][0] == a) swap(a, b);
            update(tin[a], 1);
            update(tout[a], -1);
        } else {
            if (find_ancestor(a) == find_ancestor(b)) {
                cout << "YES\n";
                v = a;
            } else {
                cout << "NO\n";
                v = b;
            }
        }
    }
    return 0;
}

Problem 2 — JOIOC 2013 Synchronization

Here's the gist of the problem:

You're given a tree with $$$N$$$ nodes. Each edge is initially deactivated and each node stores a unique piece of information.

There are $$$M$$$ events. During event $$$i$$$, the state of exactly 1 edge is toggled.

When the edge $$$(u, v)$$$ becomes activated, $$$u$$$'s component and $$$v$$$'s component merge and "sync up". All nodes in the merged component will contain all the information stored on the other nodes in the component.

After all these events, you want to know how many unique pieces of information are stored in each node.

$$$N \leq 10^5$$$ and $$$M \leq 2 \cdot 10^5$$$

Solution

Firstly, root the tree arbitrarily. Let $$$a_i$$$ be the amount of unique information stored on node $$$i$$$. Let $$$c_i$$$ be the amount of unique information stored on node $$$i$$$ right before the edge from node $$$i$$$ to its parent was deactivated.

Notice how if we make the edge $$$(u, v)$$$ activated, then the new amount of information on all nodes in the merged component is $$$a_u + a_v - c_v$$$.

This gives us a way to find the amount of information on the merged component, but we still need a way to set each node in the component to have that amount of information.

Fortunately, we don't have to do that!

Since each node in a component has the same amount of information, we can just store that amount in the "root" (i.e. the lowest node) of the component. Let the root of $$$i$$$'s component be $$$root[i]$$$. The amount of information stored on $$$i$$$ is thus $$$a_{root[i]}$$$.

When we deactivate the edge $$$(u, v)$$$, $$$v$$$ becomes the root of its new component and $$$root[u]$$$ doesn't change. This means we can simply set $$$a_v$$$ and $$$c_v$$$ to equal $$$a_{root[u]}$$$ when that happens (we don't care what $$$a_v$$$ and $$$c_v$$$ were before this).

But how do we find the root of a component?

$$$root_u$$$ is the lowest ancestor of $$$u$$$ such that the path from $$$u$$$ to $$$root_u$$$ contains no deactivated edges. We can thus use the same idea we used for the previous problem, but this time, all edges initially have their weight equal to 1 instead of 0.

Using binary lifting, we can find the root of any component in $$$O(\log^2 N)$$$ time.

This solution runs in $$$O(M \log^2 N)$$$ time.

My code for this problem

#include <bits/stdc++.h>
#define FOR(i, x, y) for (int i = x; i < y; i++)
typedef long long ll;
using namespace std;

int n, m, q;
bool active[100001];
vector<int> graph[100001];
pair<int, int> edges[200001];

int info[100001], last_sync[100001];

int timer = 1, tin[100001], tout[100001];
int anc[100001][20];

void dfs(int node = 1, int parent = 0) {
    anc[node][0] = parent;
    for (int i = 1; i < 20 && anc[node][i - 1]; i++) {
        anc[node][i] = anc[anc[node][i - 1]][i - 1];
    }

    info[node] = 1;

    tin[node] = timer++;
    for (int i : graph[node]) if (i != parent) dfs(i, node);
    tout[node] = timer;
}

int bit[100001];

void update(int pos, int val) { for (; pos <= n; pos += (pos & (-pos))) bit[pos] += val; }

int query(int pos) {
    int ans = 0;
    for (; pos; pos -= (pos & (-pos))) ans += bit[pos];
    return ans;
}

int find_ancestor(int node) {
    int lca = node;
    for (int i = 19; ~i; i--) {
        if (anc[lca][i] && query(tin[anc[lca][i]]) == query(tin[node])) lca = anc[lca][i];
    }
    return lca;
}

int main() {
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    cin >> n >> m >> q;
    FOR(i, 1, n) {
        cin >> edges[i].first >> edges[i].second;
        graph[edges[i].first].push_back(edges[i].second);
        graph[edges[i].second].push_back(edges[i].first);
    }
    dfs();

    FOR(i, 1, n + 1) {
        update(tin[i], -1);
        update(tout[i], 1);
    }

    while (m--) {
        int x;
        cin >> x;
        int u = edges[x].first, v = edges[x].second;
        if (anc[u][0] == v) swap(u, v);

        if (active[x]) {
            info[v] = last_sync[v] = info[find_ancestor(u)];
            update(tin[v], -1);
            update(tout[v], 1);
        } else {
            info[find_ancestor(u)] += info[v] - last_sync[v];
            update(tin[v], 1);
            update(tout[v], -1);
        }
        active[x] = !active[x];
    }

    while (q--) {
        int x;
        cin >> x;
        cout << info[find_ancestor(x)] << '\n';
    }
    return 0;
}

Conclusion

I hope you've learned something from this tutorial. If anything is unclear, please let me know in the comments below.

Additional problems

COCI 2020 Putovanje
JOI 2015 Railroad but on a general tree instead of a line
USACO 2015 Max Flow
SACO 2015 Towers (Doesn't use a Fenwick tree but has a similar idea)

Rev.	By	When	Δ	Comment
en3	dolphingarlic	2020-06-09 19:09:43	2161	Tiny change: 'hen apply our trick and solve' -> 'hen apply the above idea and solve' (published)
en2	dolphingarlic	2020-06-09 18:10:34	667	Tiny change: 'initially 0. Support ' -> 'initially . Support '
en1	dolphingarlic	2020-06-08 21:16:15	8031	Initial revision (saved to drafts)

Introduction

Prerequisites

The idea

Problem 1 — Infoarena Disconnect

Solution

Problem 2 — JOIOC 2013 Synchronization

Solution

Conclusion

Additional problems

History