### maomao90's blog

By maomao90, history, 9 months ago,

Recently, while doing Universal Cup Round 5, I got stuck on problem A, a tree problem, because I realized that my solution required far too much memory. After the contest, however, I realized that there was a way to reduce a lot of memory using HLD. So here I am with my idea...

# Structure of Tree DP

Most tree DP problems follow the structure below.

```cpp
struct S {
    // DP value of a subtree
};
S init(int u) {
    // initialise the base state of dp[u]
}
S merge(S left, S right) {
    // returns the new DP state where the old state is left and we transition using right
}
S dp(int u, int p) {
    S res = init(u);
    for (int v : adj[u]) {
        if (v == p) continue;
        res = merge(res, dp(v, u));
    }
    return res;
}
int main() {
    dp(1, -1);
}
```

An example of a tree DP using this structure is maximum independent set (MIS).

Code

Suppose struct $S$ requires $|S|$ bytes and our tree has $N$ vertices. Then this naive implementation of tree DP requires $O(N \cdot |S|)$ memory, since the res of every ancestor is kept on the recursion stack while we recurse down to a leaf. This is fine for many problems, as usually $|S| = O(1)$. However, in the problem above, $|S| = 25^2 \cdot 24$ bytes and $N = 10^5$, which requires around $1.5$ gigabytes of memory, far exceeding the memory limit of $256$ megabytes.

# Optimization

We make use of the idea behind HLD: at every vertex, visit the child with the largest subtree size first.

```cpp
struct S {
    // DP value of a subtree
};
S init(int u) {
    // initialise the base state of dp[u]
}
S merge(S left, S right) {
    // returns the new DP state where the old state is left and we transition using right
}
int sub[MAXN];
void getSub(int u, int p) {
    sub[u] = 1;
    pair<int, int> heavy = {-1, -1};
    for (int i = 0; i < (int) adj[u].size(); i++) {
        int v = adj[u][i];
        if (v == p) continue;
        getSub(v, u);
        sub[u] += sub[v];
        heavy = max(heavy, {sub[v], i});
    }
    // move the child with the largest subtree size to the front
    if (heavy.first != -1) {
        swap(adj[u][0], adj[u][heavy.second]);
    }
}
S dp(int u, int p) {
    // do not initialize yet
    S res;
    bool hasInit = false;
    for (int v : adj[u]) {
        if (v == p) continue;
        S tmp = dp(v, u); // recurse before initialising res
        if (!hasInit) {
            res = init(u);
            hasInit = true;
        }
        res = merge(res, tmp);
    }
    if (!hasInit) {
        res = init(u); // leaf: nothing was merged
    }
    return res;
}
int main() {
    getSub(1, -1);
    dp(1, -1);
}
```

If we analyze the memory complexity properly, we will realize that it becomes $O(N + |S|\log N)$. The $O(N)$ comes from storing the subtree sizes, and the $O(|S|\log N)$ comes from the DP itself.

## Proof

The two main changes from our naive DP are that we initialize res only after the first child returns, and that we visit the child with the largest subtree size first. Recalling the definitions used in HLD, we define an edge $p_u \rightarrow u$ to be a heavy edge if $u$ is the child with the largest subtree size among all children of $p_u$; we call all other edges light edges. It is well known that the path from the root to any vertex of a tree traverses at most $O(\log N)$ light edges: crossing a light edge at least halves the size of the current subtree, so it can happen at most $\log_2 N$ times.

Making use of this idea, if we visit the heavy child first, the res of the parent of a heavy edge is not yet stored on the recursion stack, since we only initialize res after the first child returns; hence only the res of parents of light edges are held on the stack. Then, since any root-to-vertex path contains at most $O(\log N)$ light edges, the memory complexity of the DP is $O(|S| \log N)$.

My solution to the above problem

# Conclusion

I have not seen this idea anywhere before and only came up with it recently during Universal Cup. Please tell me if it is actually a well-known technique or if there are flaws in my analysis. I hope this will be helpful to everyone!

• +270

 » 9 months ago, # |   +51 This is one of the most beautiful techniques I've ever seen, hands down. I would have never thought you could do something like this.
 » 9 months ago, # |   +16 Nice! But shouldn't we first go to the child, and only then initialise our base state?

```cpp
auto save = dp(v, u);
if (!hasInit) {
    res = init();
    hasInit = true;
}
res = merge(res, save);
```
•  » » 9 months ago, # ^ |   +10 Oops yes sorry my mistake
•  » » 9 months ago, # ^ | ← Rev. 2 →   +12 This technique is also mentioned in the editorial of 103119I - Nim Cheater, and it's the intended solution for that problem. Also, I believe this should be the correct source code:

```cpp
S dp(int u, int p) {
    // do not initialize yet
    S res;
    bool hasInit = false;
    for (int v : adj[u]) {
        if (v == p) continue;
        if (!hasInit) {
            res = dp(v, u);
            res = merge(init(), res);
            hasInit = true;
            continue;
        }
        res = merge(res, dp(v, u));
    }
    if (!hasInit) {
        res = init();
    }
    return res;
}
```

Otherwise, if the tree is a bamboo, you will call init() for all nodes before returning from the leaf; therefore, it's O(N) memory instead of the O(log N) we are targeting.
•  » » 9 months ago, # ^ |   0 Oh wow I see thanks!
 » 9 months ago, # |   +11 Is trivial == well-known?
 » 9 months ago, # |   +13 Thanks a lot for sharing this with us. At first, it might look like it doesn't have much of a use case. However, it is actually helpful. For example, there was a tree DP problem in my second NOI (National OI). It was quite simple, but it required $O(nk)$ memory where $n \leq 10^5$, $k \leq 200$, and the memory limit was 32 MB, which is why the main problem was reducing memory usage. Even though other people managed to squeeze in wrong solutions because of bad test cases, I tried many different methods, but I couldn't get AC. If I knew this back then, I would've solved it easily and won a gold medal.
•  » » 9 months ago, # ^ | ← Rev. 2 →   -29 okaragul not giving a fuck and inventing this in like 30 seconds (In my defense I was the one lucky enough to teach him HLD in the first place, so I'm taking full credit)
 » 9 months ago, # |   +3 Always wrote tree dp using hld, but never thought it actually optimized something
 » 9 months ago, # | ← Rev. 2 →   +3 A maybe related blog post (for solving problem A in ucup): https://codeforces.com/blog/entry/67001. In some sense, centroid decomposition can also help.
 » 9 months ago, # |   +18 Why we are calling getSub() for the root only?
•  » » 9 months ago, # ^ | ← Rev. 2 →   0 Oops i forgot to call the function inside the for loop. It was meant to be a dfs by the way
 » 9 months ago, # |   +24 I think it's more-or-less intended solution of ICPC WF 2020 problem B
 » 9 months ago, # |   0 Wasn't this trick in IOI 2009 — Regions? Also, as Alperent said, in our national OI.
 » 9 months ago, # |   +16 I think your getSub function doesn't really calculate subtree sizes :)
•  » » 9 months ago, # ^ |   +11 Bruh I can't code DFS 🤡
 » 5 months ago, # |   0 I tried to solve the problem using $\binom{25}{2}$ DFSs, and instead of the memory problem you faced, I got stuck being unable to make the code fast enough to pass the time limit. One trick I came up with is to re-index the nodes according to the pre-order of the tree: my code does many parent-child calculations, and if the indices are not randomly assigned, some time can be saved.

TLE code

AC code

In short, I added the following:

```cpp
int tin[NMAX], tim;
void dfs(int u, int p) {
    tin[u] = tim++;
    for (auto v : adj[u]) if (v != p) {
        dfs(v, u);
    }
}
// ...
{
    dfs(0, -1);
    vi nvec(n);
    vector<vi> nadj(n);
    for (int i = 0; i < n; i++) {
        nvec[tin[i]] = vec[i];
    }
    for (int i = 0; i < n; i++) {
        for (auto v : adj[i]) {
            nadj[tin[i]].push_back(tin[v]);
        }
    }
    vec.swap(nvec);
    adj.swap(nadj);
}
```