### Bashca's blog

By Bashca, history, 10 months ago,

Hello, Codeforces community!

In this opportunity I wanted to show you a trick that I discovered a some time ago. The basic idea is to generalize the Fenwick Tree in order to make it work on trees.

I am very grateful to racsosabe, for helping me in the revision and correction of this blog.

A few days ago I read this dolphingarlic's blog, and I found it pertinent to talk about this trick. I've solved some problems using this idea but I don't know if it is not known or I just didn't know the name. I'll start by making some proofs and checking a few examples.

### Definition 1 (Centroid):

A centroid of a tree is a node $x$ that minimizes the maximum size among all the trees in the forest generated by eliminating $x$.

A significant theorem tells us that a centroid is that node that, after its elimination, makes the maximum tree size less than or equal to $\frac{n}{2}$. We can find some simple proofs here, and a way to find it here.

## Centroid Tree

Centroid Tree is the name given to the tree found when computing the well-known centroid decomposition.

There are many useful tutorials on how to build and use the Centroid Tree. Some really cool tutorials: 1, 2, 3, 4.

### Lemma 2:

• The height of a Centroid Tree is at most $\log_{2}n + 1$.

Remark: Note that each tree in the forest when removing a centroid has at most size $\frac{n}{2}$.

### Lemma 3:

• Given two nodes $u$ and $v$, the LCA of $u$ and $v$ in the Centroid Tree belongs to the path between $u$ and $v$ in the original tree.

Remark: Note that a centroid separates all the paths between nodes in different subtrees, and when we choose the LCA between $u$ and $v$ as centroid, it is the first time that $u$ and $v$ are in different subtrees.

### Corollary 4:

• Let $x$ be an arbitrary node, the $n$ paths from $x$ to any node are encoded in at most $\log_{2}n + 1$ nodes.

Lemma 1 and 2 tells us that the LCAs of a node $x$ in the Centroid Tree are part of every path that begins at $x$. Furthermore, in the remark of lemma 2 we can see that the LCA of $u$ and $v$ is the only common ancestor of both nodes in the Centroid Tree that belongs to the path between $u$ and $v$.

### Basic Implementation

const int maxn = 1e5 + 10;
vector<int> g[maxn];
int sz[maxn], pi[maxn];
bool block[maxn];
int n;

int centroid(int v, int p, int n) {
sz[v] = 1;
int mx=0, cen=0;
for (auto u : g[v]) if (u!=p && !block[u]) {
cen ^= centroid(u, v, n);
sz[v] += sz[u], mx=max(mx, sz[u]);
}
mx = max(mx, n-sz[v]);
if (2*mx <= n) pi[cen=v]=p;
return cen;
}

void decompose(int x, int p=-1, int m=n) {
int cen = centroid(x, -1, m);
if (~pi[cen]) sz[pi[cen]]=m-sz[cen];
pi[cen]=p; block[cen]=1;
for (auto v : g[cen]) if (!block[v]) {
decompose(v, cen, sz[v]);
}
}


## Descendant-Ancestor Structure

### Definition 5 (Centroid Path):

We define the centroid path of a node as to the set $C(x)$ of ancestors of $x$ (including itself) in the centroid tree.

### Definition 6 (Subtree Set):

We define the Subtree Set with respect to $r$ of a node $x$ as to the set $S_{r} (x)$ of descendants of $x$ when $r$ is root.

### Definition 7 (Ancestors Set):

We define the Ancestor Set with respect to $r$ of a node $x$ as to the set $A_{r} (x)$, of nodes on the path from $r$ to $x$, including itself.

### Lemma 8:

• Given a node $x$, the set $S_{r} (x) \cap C (x)$ encodes all paths between $x$ and $S_{r} (x)$.

Remark: Note that every node on the path from $x$ to its descendants is a descendant of $x$. Therefore, the centroids that encode the paths are also descendants.

### Lemma 9:

• Given a node $x$, the set $A_{r} (x) \cap C (x)$ encodes all paths between $x$ and $A_{r} (x)$.

Remark: Analogous to the lemma above.

### Corollary 10:

• Given two nodes $u$, $v$ and $D(u, v) = (S_{r}(u) \cap C(u)) \cap (A_{r}(v) \cap C(v))$, then, $\vert D(u, v) \vert \le 1$.

proof: If $v$ is not a descendant of $u$, then the cardinality is $0$, otherwise, there is a unique common ancestor that is necessarily a descendant of $u$ and ancestor of $v$.

It should be noted that the last corollary tells us that we can exchange information without repetition between the descendants and ancestors through the nodes in the centroid path.

### Is in subtree query

The method to check that a node is in the subtree of another one is well known, you can see the following comment.

Basic implementation:

int T = 0;
void dfs(int x, int p) {
lo[x] = T++;
for (int v : g[x]) if (v^p) {
dfs(v, x);
}
hi[x] = T++;
}

//check if y is in the subtree of x
bool is_in_subtree(int x, int y) {
return lo[x] <= lo[y] && hi[y] <= hi[x];
}


### How to update and query values to a node?

int ft[maxn];
void update(int x, int add) {
for (int ll=lo[x], rr=hi[x];~x; x=pi[x]) //for to iterate over the "Centroid Path"
if (ll <= lo[x] && hi[x] <= rr) { //update descendants
}
}

int query(int x) {
int ans=0;
for (int ll=lo[x], rr=hi[x]; ~x; x=pi[x])
if (lo[x] <= ll && rr <= hi[x]) { //consult ancestors
ans += ft[x];
}
return ans;
}


## Exercises

### Infoarena — Disconnect

Statement: You have a tree and two types of operations:

• Remove an edge.
• Query if two nodes are connected.

We can see this problem as updating values in subtrees: In the beginning, all the values start with $0$ and every time we update increment by $1$ all the nodes of the subtree, to answer we must know $value(a) + value(b) - 2 * value(LCA(a, b))$ and check if it's equal to $0$.

How does this work? We mantain a counter of deleted edges for each node. This counter will store the number of deleted edges from the root of the tree to the given node. Thus, adding $1$ to all the nodes in the subtree of $x$ will be equivalent to deleting the edge between $x$ and its parent.

code spoiler

### Codeforces — Propagation Tree

Statement: You have a tree and have two types of operations:

• Given $val$ and $x$, update the nodes with the following rule: ${-1}^{dist(x, u)} val$, for every node $u$ in the subtree of $x$.

• Query the current value of a node.

Just consider the distance to the intermediate centroid which we are going to update to calculate the value.

code spoiler

### GP of Nanjing: Data Structure

Statement: Given $T$ cases, you have a tree with $n$ nodes rooted in node $1$, initially, each node has weight $0$ and you need to make $m$ queries of two possible types:

• Given $a$, $x$, $y$, $z$, add $z$ to the weights of all descendants $v$ of $a$ such that $d(a, v) \mod x$ is $y$.

• Given a node $a$, return its weight.

The constraints are:

• $1 \le T \le 4$
• $1 \le n, m \le 3\cdot 10^{5}$
• $1 \le a, x \le n$
• $0 \le y < x$, $0 \le z \le 500$.

Solution: We can maintain for each node $v$ an array of length equal to the maximum distance from $v$ to the nodes in the tree whose centroid equal to $v$, where we save the updates to be queried by their descendants, this would give us a performance per query equal to $O(\log (n))$ (traversing the $O(\log{n}$) ancestors and getting their propagated value in $O(1)$), and a performance per update equal to $O(\frac{n}{x})$, notice that it is not $O(\frac{n}{x} \log n)$ because can be seen as the sum:

$\sum\limits_{k=0}^{\log(n) - 1} \frac{2^i}{x} = O\left(\frac{n}{x}\right)$

As this is not optimal for updates, we can save for each node $v$ a matrix of size $t(v) \times t(v)$, where we would store for $x \le t(v)$, the accumulated queries that have come, this allows us to make queries in $O(\log{n} + t(v) + t(\pi(v)) + \ldots + t(root))$ and updates in $O\left(\log{n} + \frac{len(v)}{t(v)} + \frac{len(\pi(v))}{t(\pi(v))} + \cdots + \frac{n}{t(root)}\right)$

Note: Recall that $\pi(u)$ is the parent of the node $u$ and $len(v)$ is the maximum distance from $v$ to any of its descendants.

We can see these sums in another way, since the parent centroids have at least twice as many nodes as their corresponding child centroid:

$T(Update) = O(\log{n}) + \sum\limits_{i=0}^{floor(\log (n) - 1)} \frac{2^i}{t_{i}}$

,

For queries:

$T(Query) = O(\log{n}) + \sum\limits_{i=0}^{floor(\log (n) - 1)} t_{i}$

We can ignore the $O(\log{n})$ when assigning $t_{i}$ because it's not affected by it, so if we set $t_{i} = \frac{2^{i}}{t_{i}}$, which would be $t_{i} = \sqrt{len(v_{i})}$, which would give us a complexity.

$\sum\limits_{i=0}^{floor(\log (n) - 1)} \sqrt{2}^i$
$\frac{\sqrt{2}^{\log{n}} - 1}{\sqrt{2} - 1} = O(\sqrt{n})$

therefore our final complexity would be $O(m \sqrt{n} + n \log{n})$ per test case.

code spoiler

I hope this technique will help you, I am attentive to recommendations for problems, doubts and/or errors in my analysis.

Notice that for a path with quantity of elements equal to a power of $2$, the structure represents a Fenwick Tree.