Hi, I want to share a very interesting proof for the existence of centroid.

Let's direct each edge to the side that has more vertices. (direct randomly if equal) Then you have n vertices and n-1 edges, therefore at least one vertex has an outgoing degree of 0. (If for every vertex the outgoing degree was at least one, we needed at least n edges) This vertex is your centroid. (Because all edges are directed towards it, so no child has more than n/2 vertices).

Using this approach, not only you prove centroid exists, but you can also find it.

I explained it in detail and with visuals in my video on centroid #2 here: https://www.youtube.com/watch?v=t74JJlMIQv8

Recently many people wanted me to talk more about more advanced topics. So I thought maybe I should share this here. (not many people had watched the video)

If you like such facts, let me know. I will make more videos about them.

Let me know about your opinions.

It'd be nice to see you starting to teach more advanced content

maybe graphs/trees, cp related number theory, etc

I guess this algorithm can't be very practical, as we need to know the subtree size to direct the edges which kinda defeats the point.

Still an interesting take!

Yeah, it is not practical and I don’t use it myself. But very interesting.

You could even say that an ‘optimized’ version of such an algorithm would just mark all vertices which have subtree size less than $$$n/2$$$ or have a child with size less than $$$n/2$$$, and pick the unmarked vertex as the centroid. This looks familiar…

Yes, exactly.

Well known on Earth, welcome.

Idk how other people find centroids, but I was taught to find subtree sizes, start from the root, and then go down while there is some child of size $$$> n/2$$$. This is a good way, but it is a bit bug-inviting (for example, you need to remember not to go to your parent, and in general these while loops with walks are error-prone). I recently figured out a nice alternative. We still find the sizes of subtrees, and then take a vertex $$$v$$$ with the smallest size of the subtree $$$> n / 2$$$. As $$$v$$$ subtree has size $$$> n / 2$$$, its parent subtree is small, and as we did not choose any of $$$v$$$'s children instead of $$$v$$$, it means that they are also small. For me personally, this seems to be a much nicer way. The only downside of this idea I found is that one needs to know all the vertices in the current tree to go through them, which can be not obvious in centroid decomposition.

~~Btw you can expand this idea to whole centroid decomposition. Will be single dfs calculating sizes and on particular cases $$$O(n)$$$ time complexity.~~btw shy guys, you can write something not just downvoting

Yes, this seems nice. About the downside of this approach, if you just pass the size of the subtree you are working on, then you can calculate everything inside your dfs. Right? (Just after you figured out the size of the subtree, if it was greater than or equal to n/2 you compare it with your current result)

And I think you can simply pass the size of the current subtree while implementing centroid decomposition because you know the sizes of all the subtrees.

It is gonna be nice, because you are gonna find a centroid in one single dfs. (You don't need the second one to move downward until you find the centroid)

implemented here https://github.com/yosupo06/library-checker-problems/blob/master/graph/vertex_add_range_contour_sum_on_tree/sol/correct.cpp#L118-L132

Oh, awesome.

Personally, I usually implement centroid decomposition by first running a dfs through the tree to find the subtree size for one root. Then, I start from the root. If I find a subtree that has a size strictly larger than sz[root]/2 (where sz[root] is the subtree size of the root), I will immediately go to that subtree and call the same recursive function. Or else the node we're currently standing is the centroid. Maintaining the subtree size array should be pretty straightforward

Proof for the code complexityFor convenience, I will call s(u, v) the size of the subtree of v, when I root the tree in u. Also, I'll call n the number of nodes in the tree

Assume that the code runs forever, because a tree has no cycle, there must exist 2 nodes u and v such as u is directly connected to v, and s(u, v) and s(v, u) are both larger than n/2. Then, we have s(u, v) + s(u, v) > n (contradiction, because it should be equal to n)

So the recursion function never revisits a node => the worst-case complexity is O(n)

Assume that I found a node that is the centroid, then I just set the subtree size of that node to 0 (so that there is no way that node will be revisited anymore) and then call the recursive function for all the adjacent nodes that haven't been register in the centroid tree

My implementation looks something like this

SpoilerThe depth variable is necessary if you need to access the depth of a node in the centroid tree in the future, which it is, in the problem I was trying to solve. Also, the "cenpar" variable is saving the parent of the node in the centroid tree

I think my implementation works well for some complex problems, you just need to change a little bit if you need to access additional information about the place of the node in the centroid tree

My code for a centroid decomposition problemE. Freezing with Style

Sorry for a not-very-clean code :)))

If you have some way to improve my implementation, please let me know.

(P/s: Sorry for my bad English and markdown skills)

Yes, that works. That is the basic and the most common way of implementing centroid decomposition.

Here, we are talking about interesting alternatives. But thanks for mentioning, awesome explanation!