[Tutorial] Path sum queries on a tree using a Fenwick tree

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

#	User	Contrib.
1	awoo	161
1	maomao90	161
3	adamant	156
4	maroonrk	153
5	-is-this-fft-	148
5	atcoder_official	148
5	SecondThread	148
8	Petr	147
9	nor	144
10	TheScrasse	142

Introduction

I've come across several problems that require you to be able to efficiently find the sum of some values on a path from $$$A$$$ to $$$B$$$ in a tree. One can solve these problems using tree decomposition techniques like heavy-light decomposition or centroid decomposition, but that's a bit of an overkill and not very elegant in my opinion.

In this blog post, I will describe a way to solve these types of problems using just a DFS, a Fenwick tree, and LCA (as well as a way to solve JOIOC 2013 Synchronization this way).

Note: Whenever I talk about edge $$$(u, v)$$$ in this post, I assume that $$$u$$$ is the parent of $$$v$$$.

Prerequisites

DFS
Preorder traversal (DFS order)
Fenwick tree
Binary lifting and LCA

The idea

Say you have the following problem:

Given a tree with $$$N$$$ nodes, each node $$$i$$$ has a value $$$a_i$$$ that is initially 0. Support the following 2 types of operations in $$$O(\log N)$$$ time:

Increase the value of $$$a_i$$$ by $$$X$$$

Find the sum of $$$a_i$$$ on the path from $$$u$$$ to $$$v$$$ for 2 nodes $$$u$$$ and $$$v$$$

First, we flatten the tree using a preorder traversal. Let the time we enter node $$$i$$$ be $$$tin_i$$$ and the time we exit it be $$$tout_i$$$. Additionally, let $$$b$$$ be an array/Fenwick tree of size $$$2N$$$.

If you're familiar with LCA, you'll know that node $$$u$$$ is an ancestor of node $$$v$$$ if and only if $$$tin_u \leq tin_v$$$ and $$$tout_u \geq tout_v$$$.

If you're familiar with range updates and point queries with Fenwick trees, you'll know that if we want to increase the range $$$[A, B]$$$ by $$$X$$$ in an array/Fenwick tree $$$BIT$$$, then we increase $$$BIT[A]$$$ by $$$X$$$ and decrease $$$BIT[B + 1]$$$ by $$$X$$$. Then when we want to find the value of the element at $$$C$$$, we simply query the sum of $$$BIT[i]$$$ for each $$$i$$$ from 0 to $$$C$$$. This works because if $$$C$$$ isn't inside the range of an update, the 2 values we update "cancel out" in the query.

We can combine the 2 ideas above to work on trees — if we want to increase the value of node $$$u$$$ by $$$X$$$, we increase $$$b[tin_u]$$$ by $$$X$$$ and decrease $$$b[tout_u + 1]$$$ by $$$X$$$. Then when we want to find the sum of nodes on the path between node $$$v$$$ and the root, we simply query the sum of $$$b[i]$$$ for each $$$i$$$ from 0 to $$$tin_v$$$. This works because node $$$w$$$ only contributes to the sum if $$$w$$$ is an ancestor of $$$v$$$.

We can then use LCA and the principle of inclusion-exclusion to find the sum of nodes/edges on the path between nodes $$$u$$$ and $$$v$$$.

This idea also works if we want to sum edges instead of nodes — when we update edge $$$(u, v)$$$, update $$$b[tin_v]$$$ and $$$b[tout_v + 1]$$$ and make queries similarly.

Problem 1 — Infoarena Disconnect

Here's the gist of the problem:

You're given a tree with $$$N$$$ nodes. Process $$$M$$$ of the following queries online:

Delete the edge $$$(u, v)$$$ from the path.
Determine whether there is a path from $$$u$$$ to $$$v$$$.

$$$N \leq 10^5$$$ and $$$M \leq 5 \cdot 10^5$$$

Solution

Let the value of each edge in the tree initially be 0. When we "delete" an edge, we just update its value to be 1.

Notice how there is a path from $$$u$$$ to $$$v$$$ if and only if the sum of edges on the path between $$$u$$$ and $$$v$$$ is 0.

We can then apply the above idea and solve this problem in $$$O(M \log N)$$$ time.

Alternatively,

Find the lowest ancestor $$$l_u$$$ of $$$u$$$ such that the sum of the edges between $$$u$$$ and $$$l_u$$$ is 0
We can do this using binary lifting and our trick
Find $$$l_v$$$ similarly for $$$v$$$.
Check whether $$$l_u$$$ is an ancestor of $$$v$$$ and whether $$$l_v$$$ is an ancestor of $$$u$$$.

This solution runs in $$$O(M \log^2 N)$$$ time.

My code for this problem

#include <bits/stdc++.h>
#define FOR(i, x, y) for (int i = x; i < y; i++)
typedef long long ll;
using namespace std;

int n, m;

vector<int> graph[100001];
int timer = 1, tin[100001], tout[100001];
int anc[100001][18];

void dfs(int node = 1, int parent = 0) {
    anc[node][0] = parent;
    for (int i = 1; i < 18 && anc[node][i - 1]; i++)
        anc[node][i] = anc[anc[node][i - 1]][i - 1];

    tin[node] = timer++;
    for (int i : graph[node]) if (i != parent) dfs(i, node);
    tout[node] = timer;
}

int bit[100001];

void update(int pos, int val) { for (; pos <= n; pos += (pos & (-pos))) bit[pos] += val; }

int query(int pos) {
    int ans = 0;
    for (; pos; pos -= (pos & (-pos))) ans += bit[pos];
    return ans;
}

int find_ancestor(int node) {
    int lca = node;
    for (int i = 17; ~i; i--) {
        if (anc[lca][i] && query(tin[anc[lca][i]]) == query(tin[node])) lca = anc[lca][i];
    }
    return lca;
}

int main() {
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    freopen("disconnect.in", "r", stdin);
    freopen("disconnect.out", "w", stdout);
    cin >> n >> m;
    FOR(i, 1, n) {
        int a, b;
        cin >> a >> b;
        graph[a].push_back(b);
        graph[b].push_back(a);
    }
    dfs();

    int v = 0;
    while (m--) {
        int t, x, y;
        cin >> t >> x >> y;
        int a = x ^ v, b = y ^ v;

        if (t == 1) {
            if (anc[b][0] == a) swap(a, b);
            update(tin[a], 1);
            update(tout[a], -1);
        } else {
            if (find_ancestor(a) == find_ancestor(b)) {
                cout << "YES\n";
                v = a;
            } else {
                cout << "NO\n";
                v = b;
            }
        }
    }
    return 0;
}

Problem 2 — JOIOC 2013 Synchronization

Here's the gist of the problem:

You're given a tree with $$$N$$$ nodes. Each edge is initially deactivated and each node stores a unique piece of information.

There are $$$M$$$ events. During event $$$i$$$, the state of exactly 1 edge is toggled.

When the edge $$$(u, v)$$$ becomes activated, $$$u$$$'s component and $$$v$$$'s component merge and "sync up". All nodes in the merged component will contain all the information stored on the other nodes in the component.

After all these events, you want to know how many unique pieces of information are stored in each node.

$$$N \leq 10^5$$$ and $$$M \leq 2 \cdot 10^5$$$

Solution

Firstly, root the tree arbitrarily. Let $$$a_i$$$ be the amount of unique information stored on node $$$i$$$. Let $$$c_i$$$ be the amount of unique information stored on node $$$i$$$ right before the edge from node $$$i$$$ to its parent was deactivated.

Notice how if we make the edge $$$(u, v)$$$ activated, then the new amount of information on all nodes in the merged component is $$$a_u + a_v - c_v$$$.

This gives us a way to find the amount of information on the merged component, but we still need a way to set each node in the component to have that amount of information.

Fortunately, we don't have to do that!

Since each node in a component has the same amount of information, we can just store that amount in the "root" (i.e. the lowest node) of the component. Let the root of $$$i$$$'s component be $$$root[i]$$$. The amount of information stored on $$$i$$$ is thus $$$a_{root[i]}$$$.

When we deactivate the edge $$$(u, v)$$$, $$$v$$$ becomes the root of its new component and $$$root[u]$$$ doesn't change. This means we can simply set $$$a_v$$$ and $$$c_v$$$ to equal $$$a_{root[u]}$$$ when that happens (we don't care what $$$a_v$$$ and $$$c_v$$$ were before this).

But how do we find the root of a component?

$$$root_u$$$ is the lowest ancestor of $$$u$$$ such that the path from $$$u$$$ to $$$root_u$$$ contains no deactivated edges. We can thus use the same idea we used for the previous problem, but this time, all edges initially have their weight equal to 1 instead of 0.

Using binary lifting, we can find the root of any component in $$$O(\log^2 N)$$$ time.

This solution runs in $$$O(M \log^2 N)$$$ time.

My code for this problem

#include <bits/stdc++.h>
#define FOR(i, x, y) for (int i = x; i < y; i++)
typedef long long ll;
using namespace std;

int n, m, q;
bool active[100001];
vector<int> graph[100001];
pair<int, int> edges[200001];

int info[100001], last_sync[100001];

int timer = 1, tin[100001], tout[100001];
int anc[100001][20];

void dfs(int node = 1, int parent = 0) {
    anc[node][0] = parent;
    for (int i = 1; i < 20 && anc[node][i - 1]; i++) {
        anc[node][i] = anc[anc[node][i - 1]][i - 1];
    }

    info[node] = 1;

    tin[node] = timer++;
    for (int i : graph[node]) if (i != parent) dfs(i, node);
    tout[node] = timer;
}

int bit[100001];

void update(int pos, int val) { for (; pos <= n; pos += (pos & (-pos))) bit[pos] += val; }

int query(int pos) {
    int ans = 0;
    for (; pos; pos -= (pos & (-pos))) ans += bit[pos];
    return ans;
}

int find_ancestor(int node) {
    int lca = node;
    for (int i = 19; ~i; i--) {
        if (anc[lca][i] && query(tin[anc[lca][i]]) == query(tin[node])) lca = anc[lca][i];
    }
    return lca;
}

int main() {
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    cin >> n >> m >> q;
    FOR(i, 1, n) {
        cin >> edges[i].first >> edges[i].second;
        graph[edges[i].first].push_back(edges[i].second);
        graph[edges[i].second].push_back(edges[i].first);
    }
    dfs();

    FOR(i, 1, n + 1) {
        update(tin[i], -1);
        update(tout[i], 1);
    }

    while (m--) {
        int x;
        cin >> x;
        int u = edges[x].first, v = edges[x].second;
        if (anc[u][0] == v) swap(u, v);

        if (active[x]) {
            info[v] = last_sync[v] = info[find_ancestor(u)];
            update(tin[v], -1);
            update(tout[v], 1);
        } else {
            info[find_ancestor(u)] += info[v] - last_sync[v];
            update(tin[v], 1);
            update(tout[v], -1);
        }
        active[x] = !active[x];
    }

    while (q--) {
        int x;
        cin >> x;
        cout << info[find_ancestor(x)] << '\n';
    }
    return 0;
}

Conclusion

I hope you've learned something from this tutorial. If anything is unclear, please let me know in the comments below.

Additional problems

COCI 2020 Putovanje
JOI 2015 Railroad but on a general tree instead of a line
USACO 2015 Max Flow
SACO 2015 Towers (Doesn't use a Fenwick tree but has a similar idea)

Comments (24)

Write comment?

dolphingarlic

4 years ago, # |

Auto comment: topic has been updated by dolphingarlic (previous revision, new revision, compare).

→ Reply

arvindr9

← Rev. 2 →

"JOI 2015 Railroad but on a general tree instead of a line" Is this a generalization of a particular problem? Which problem is this a generalization of?

4 years ago, # ^ |

It's a generalization of JOI 2015 Railroad. The original problem requires you to find the sum of edges on a line graph, which you can do with a line sweep

Rezwan.Arefin01

He meant solve the thing asked in "Railroad" on a tree instead of a line.

Yeah. dolphingarlic I think there is a typo. It should say "JOI 2015 Railroad but on a line instead of a general tree"

SuperJ6

That's not how english works...

sahil_k

Can we modify this to work for subtree updates as well? For example, if we wanted to add $$$X$$$ to the subtree of $$$a_i$$$ in $$$O(\log N)$$$ time?

GustavoGuerra

Yes

TheScrasse

← Rev. 3 →

Can't you calculate the sum of ranges along with binary lifting?
Example: 80876049 (here the maximum is required instead of the sum, but the idea is identical)
rm[u][k] = max(rm[u][k - 1], rm[parent[u][k - 1]][k - 1]); calculates the maximum in a range of length $$$2^k$$$.

You could do that if there are no updates, but I don't think there's an easy way to handle updates without a Fenwick tree

lol right, I'm stupid

TripleM5da

-10

Not to be a pain in the ass or anything but this is the same idea as Mo on trees

+13

Yeah but I think it's kinda nice how this post talks about queries when you add / remove edges from a tree (I don't think that is mentioned in the linked post)

codebuster_10

3 years ago, # ^ |

Am I mistaken, as the approaches feels different to me as Mo on trees relies on modified dfs traversal, the other uses the normal preorder array, the MO one is more generalized also.

HynDuf

Can anybody look at my code of problem 1: Infoarena Disconnect and tell my why this code gives TLE for the last 6 test cases? Ideone.

My code is supposed to work in $$$O(n log n)$$$ as you see it. A weird thing is when I set $$$z = u$$$ and not use LCA then it runs only $$$600ms$$$ (but WA obviously).

Thanks <3.

OK I got AC :<. But by switching $$$pa[17][100001] -> pa[100001][17]$$$. Can anybody tell me why :<. It's really weird as I read somewhere someone told that $$$pa[17][100001]$$$ should be faster.

albjeno

Statements like 'this is faster than that' are usually only valid in certain contexts. Always try to remember which those are. Otherwise they are mere assumptions.

Multidimensional arrays shall always be accessed by first iterating over the right-most index. In your code, you have lots of hot loops iterating from 1 to 16 or similar. With pa[17][100001], two consecutive accesses lead to a cache miss each time since the accessed memory locations are 100001 * sizeof(int) bytes apart. With the sizes swapped, you are accessing the bytes in sequence (at least in a forward loop, backwards pre-fetching depends on the smartness of the compiler).

I am surprised, that it makes such a difference. Optimizations like these should be thoroughly benchmarked before making any claims, let alone going into production.

int a = x ^ v, b = y ^ v;

dolphingarlic or someone: What exactly is being done here (in the solution for Problem 1)? I don't quite get what the xor's are doing here. Is v necessary?

EDIT: didn't read the problem statement carefully enough, my bad

I see that USACO 2015 Max Flow is cited as an additional problem. Is it possible to do that using the technique mentioned in the post? Seems like a different kind of problem (since you're updating paths rather than points). I can think of a solution that uses HLD + Fenwick, but I'm curious to see if there is a solution that uses the technique in the post.

The idea of that problem is to count the amount each edge contributes to the answer.

If you only update tin[i] instead of both tin[i] and tout[i], you can get the subtree sum by querying the range [tin[i], tout[i]]. Consider some path A--B. If you increase tin[A] and tin[B] by 1 and decrease tin[lca(A, B)] by 2, then only nodes on the path A--B get that +1 when you query their subtree sums.

The COCI problem uses a similar idea.

(Upon writing this, I realize I should've mentioned this technique in the post as well)

Doesn't lca(A, B) get +0 in this case (if you query the subtree sum from [tin[lca(A, B)], tout[lca(A, B)]])?

← Rev. 4 →

EDIT:

I tried the following instead (as this forces +1 to be contributed to the LCA):

int l = lca(u, v);
update(tin[u], +1);
update(tin[v], +1);
update(tout[l], -1);
update(tin[l], -1);

And query the max over the sums in [tin[u], tout[u] — 1]

I was able to get AC! https://pastebin.com/X6ASEEvg

Also in case anyone is curious, here is my HLD + BIT solution (which also got AC): https://pastebin.com/duZ3cjPR

Luca

The problem Ball Machine from BOI2013 can be solved using a similar idea.

manoj_269

Nice Tutorial.Hope I have learnt something new and interesting.

dolphingarlic's blog

Introduction

Prerequisites

The idea

Problem 1 — Infoarena Disconnect

Solution

Problem 2 — JOIOC 2013 Synchronization

Solution

Conclusion

Additional problems