№	Пользователь	Рейтинг
1	tourist	3845
2	jiangly	3707
3	Benq	3630
4	orzdevinwang	3573
5	Geothermal	3569
5	cnnfls_csy	3569
7	jqdai0815	3532
8	ecnerwala	3501
9	gyh20	3447
10	Rebelz	3409

№	Пользователь	Вклад
1	maomao90	171
2	adamant	163
3	awoo	162
4	maroonrk	152
4	nor	152
6	-is-this-fft-	151
7	TheScrasse	148
8	atcoder_official	146
9	Petr	145
10	pajenegod	144

Introduction

Hello Codeforces, I came across this problem a while back — https://cses.fi/problemset/task/1139/. Here the task is — "Given a colored, rooted tree, determine the number of distinct colors in each subtree of the tree". It is possible to solve this problem by flattening the tree and then using a binary indexed tree(now that subtrees have converted into subarrays). I will discuss an alternate technique to solve this which can help in cases where it is useful to have the entire subtree as a set in a dfs call.

$$$O(n^2)$$$ solution

If $$$n$$$ were small, we could have stored for each node, the set of elements in its subtree. A DFS can be used to evaluate this which will take $$$O(n^2log n)$$$ time and $$$O(n^2)$$$ space.

vector<int> color(n);
vector<set<int>> subtree(n);

void dfs(int i,int parent){
	subtree[i].insert(color[i]);
	for(int node : graph[i]){
		if(node != parent){
			dfs(node,i);
			subtree[i].insert(subtree[node].begin(),subtree[node].end());
		}
	}
}

The number of distinct colors in the subtree of the $$$i^{\text{th}}$$$ node is just the size of the set corresponding to it.

An Improvement

The problem with this technique is repetition. Each element occurs in $$$depth[n]$$$ different sets which is inefficient. For an improvement, we can consider the same idea used in heavy-light decomposition and union-find optimizations, which is "small-to-large merging".

We will use a DFS as before. Now to construct the set of all elements in the node's subtree, we will pick the child with the largest subtree set. Now instead of making a new set for the current node, we will use the same set as the child with the largest subtree and insert all other children's set in it. To avoid consuming extra space and time, we will use pointers to accomplish this task.

vector<int> color,distinct;
vector<set<int>*> subtree;
 
void dfs(int i,int parent = -1){
	int largest = -1;
	vector<int> children;
	for(int node : graph[i]){
		if(node != parent){
			dfs(node,i);
			children.push_back(node);
			if(largest == -1 || subtree[largest]->size() < subtree[node]->size()){
				largest = node;
			}
		}
	}
	if(largest == -1){
		subtree[i] = new set<int>; // new set for leaf node
	}
	else{
		subtree[i] = subtree[largest]; // largest sized child
	}
	
	for(int child : children){
		if(child == largest)continue;
		subtree[i]->insert(subtree[child]->begin(),subtree[child]->end());
	}
	subtree[i]->insert(color[i]);
	distinct[i] = subtree[i]->size();
}

Time complexity

Every time we copy an element over, the set its now in will be at least 2 times larger than the set it was previously in. Hence the time complexity is $$$O(n log^2 n)$$$ (or $$$O(n log n)$$$ if we avoid using sets).

Thanks to -is-this-fft- for this

Conclusion

Here is my solution for the CSES task — https://cses.fi/paste/e75f5efc2e1cd2fc3c8a66/

This technique can be used in various other tree problems. Examples of such problems could be to find the most frequent element in each node's subtree or to find the pair with minimum difference in each node's subtree.

Rev.	Кто	Когда	Δ	Комментарий
en7	shadow9236	2022-05-21 09:59:32	0	(published)
en6	shadow9236	2022-05-21 09:58:27	6
en5	shadow9236	2022-05-21 09:57:01	43	Grammar
en4	shadow9236	2022-05-21 09:52:36	44
en3	shadow9236	2022-05-21 09:49:39	2815	Tiny change: 'Thanks to @-is-this-f' -> 'Thanks to /profile/-is-this-f'
en2	shadow9236	2022-05-21 09:17:42	529
en1	shadow9236	2022-05-21 09:10:06	272	Initial revision (saved to drafts)

Rev.

Язык

Кто

Когда

Комментарий

en7