### PeruvianCartel's blog

By PeruvianCartel, 4 months ago,

Hi guys. So, Centroid Decomposition, I never really paid attention to its runtime. That is until I was doing a problem (in an OI contest) that goes kinda like this:

Statement: Given a tree of $N$ nodes ($N \leq 4*10^5$) that are not colored. There are two types of queries:
- Color a node.
- Among the nodes that are colored, print the furthest distance between two colored node.
Time limit: 1 second.

This problem looks like the classic and beloved E — Xenia and Tree. Therefore, I just used Centroid Decomposition without giving it much thought. And to my surprise, I got FST because my Centroid Decomposition code was too slow. And I was really confused. You know, an $O(N * log N)$ algorithm getting TLE with $N = 10^6$ sounds unlikely already, let alone $N = 4*10^5$. Not only that, it took twice as long as I would have liked.

So... Why is that? Why is centroid decomposition so slow? I mean, it's literally find centroid of a tree, remove that node and do the same for all of the subtrees, ooga booga and you're done, so I really have no clue what is holding it back, and how do you optimize Centroid Decomposition (aka what is the best way to implement Centroid Decomposition for speed without touching black magic such as SIMD and pragma)? Here is my Centroid Decomposition implementation: 237313601

• +3

By PeruvianCartel, history, 4 months ago,

Hi guys. I was upsolving OI problems, when I stumbled upon this:

Statement: There are $n$ soldiers standing in a line, whose heights are distinct. A dude can be seen from the left side if there are no other dude to the left of him that is taller than him, and the same goes for the right side. Given $n$, $p$, $q$ ($n \leq 2000$), count how many way to arrange these dudes such that there are exactly $p$ dudes that can be seen from the left side, and $q$ dudes that can be seen from the right side. TL = $1$ second on a judge that runs as fast as Codeforces.

My idea: First we solve the case $q = 1$. For the sake of simplicity, we will assume the heights of these soldiers as a permutations of integers from $1$ to $n$. We can use dp: $f[i][j][k] =$ number of way to assign soldiers for the first $i$ position, such that the tallest soldiers that we assigned has the height $j$, and you can see $k$ dudes from the left side. From there it's just simple dp. For the general case, the answer is simply $\sum_{t = 1}^n (f[t-1][t-1][p-1] * f[n-t][n-t][q - 1] * C_{n-1} ^ {i-1})$. Sadly, this runs in $O(n^3)$, since our dp table has $n^3$ states.

I've discussed this problem with another dude who ranked Master on Codeforces, and we are proud to say that we can not cook, so any help would be appreciated. Thanks!

• +9

By PeruvianCartel, history, 4 months ago,

Hi guys. When I was doing a problem that basically requires a reasonably fast maximum flow template, one thought came across my mind: Is it possible to hack Ford-Fulkerson(DFS) + blocking flow + randomization? I know that using BFS to run Ford-Fulkerson is much safer, as there is a test generator (It's for MCMF, which is literally DFS Ford-Fulkerson) that make the basic DFS Ford-Fulkerson runs in $F$ iterations, resulting in the complexity of $O(EF)$ ($E$ is number of edges, $F$ is the maximum flow of the network).

However, I just really want to use DFS Ford-Fulkerson, since you can do blocking flow on it, which grants some pretty bullsh*t performance boost on average (and it's trivial to implement). So... is there any close upper-bound on the complexity of DFS Ford-Fulkerson given these optimization? And if my optimization sucked, is there a test generator to blow it up? (Obviously I can get godlike luck and find the maximum flow on the first iteration. Conversely, RNGesus could just walked my algorithm right into the worst case, so we are only counting the average case here)
• +17

By PeruvianCartel, history, 4 months ago,

Hi guys, in this problem: CSES: Maximum Building II, the constraint for $n$, $m$ was $1000$, which sounds fishy. Therefore, I tried the trivial $O(n^3)$ brute, and surprisingly it AC with the slowest testcase being 0.9s (I guess it's because the operations was so simple and predictable that the compiler and CPU just do some hocus pocus and speed it up like 10 times).
So... why did the author decide to set the constraint that low? You can solve that problem in $O(n^2)$, with good constant, so it's not like you have to be extremely generous when it comes to time limit to let these solutions pass, so what is the motive behind it? Was the $O(n^3)$ brute the intended algorithm? (which doesn't really make sense, since it's supposed to be a hard problem????)

• +20