Blogewoosh #7

Правка en1, от Radewoosh, 2019-05-28 01:41:58

Hello, codeforces!

Long time no see, right? So maybe it's a good idea to try to return to my blogs slowly. This time the blog will be about a trick, which usually isn't necessary to solve a task, but can be useful to make implementation much more comfortable.

Let's look at this problem. It is about some DP on a tree in which we have to use convex hull trick to improve the complexity. The task requires merging two convex hulls with "smaller to bigger" trick. I recommend you to read the statement before reading the rest of the blog (and the editorial if you don't know how to solve it). 

So, how to merge two convex hulls? If we keep them in something like the set from C++, then we can just take every element from the smaller one and try to add it to the bigger one. Unfortunately, adding one element in this way requires some ifology and care. It's hard to implement it, especially on individual contests where you don't have a reference document and cannot use the internet (like IOI-style contests).

We know that adding one element is hard, right? So, what is an easier way to get the new convex hull? Definitely, the easiest way is to collect all the points, sort them, and calculate the convex hull with the standard algorithm.

But we still have to care about the complexity. So, here comes mnbvmar with his great approach. I'm very grateful for sharing it with me. Instead of one convex hull in each node, let's split points from that node into $\log(n)$ groups and maintain the convex hull of each of them! How to split them? I'll tell about it in a moment. Why will it work? In this task, each node corresponds to a single point on the plane (or a linear function if you think about lines and not about points, but I'm going to stay with points). We need to answer multiple queries where we're given a vector and a subtree, and we need to minimize the dot product of the vector and some point corresponding to one of the nodes in this subtree. Normally, it would be done using a binary/ternary search on a convex hull, so we can simply do it on each of the hulls separately and return the minimum of the results.

For example, if we have to keep a convex hull of, let's say, $11$ points, let's split them in any way into groups of $1$, $2$ and $8$ points (yep, it's the binary representation of $11$). Now, let's say that we want to merge convex hulls, where one of them keeps $11$ ($1+2+8$) points and the second one keeps $36$ ($4+32$) points. The sizes of all groups are different, so we can straightforwardly make a new structure which consists of $5$ convex hulls of sizes $1$, $2$, $4$, $8$ and $32$.

But, what if we want to merge convex hulls of $35$ ($1+2+32$) and $38$ ($2+4+32$) points? If we allowed having two groups of equal sizes, we could quickly end with so many groups of this same size. As we still care about the complexity and want to have very few convex hulls in each node (node of the tree, recalling to the problem statement), we should change the sizes in some way. We have two groups of size $2$, so maybe let's try to combine them into one group of size $4$? By combining, I mean using the standard convex hull algorithm on points from these two hulls to make one hull. Now we have groups of sizes $1$, $4$, $4$, $32$, $32$. Let's now combine groups of size $4$. Poof, $1$, $8$, $32$, $32$. Now it's time to merge the groups of size $32$. The sizes are $1$, $8$, and $64$ now, which already looks nice. We had to do some recalculations, but maybe the situation isn't so bad. Of course, the complexity of calculating new convex hull is linear (or has an additional logarithm if you want to sort the points). So each time when we touch a point (we recalculate a convex hull containing it) it adds $O(1)$ (or $O(\log(n)$) if you're sorting points) to complexity. How many times can we touch a point? Whenever we do it, the size of its convex hull doubles, so we can do it only $O(\log(n))$ times!

You might be confused with these sizes because some points become useless as soon as they stop belonging to a convex hull. So yes, it's true that these sizes might not be exact powers of two, but they could be less. It isn't important. We can remember their original sizes and assume that these useless points are still there.

The complexity of recalculations is $O(n \cdot \log(n))$ (or $O(n \cdot \log^2(n))$) in total. On the other side, we have to do a binary/ternary search $O(\log(n))$ times in each node, instead of doing only one, so the final complexity will be $O(n \cdot \log^2(n))$ in overall, which makes this solution as fast as the model solution.

Nice, isn't it? Vectors instead of sets (using C++ slang), easier code, fewer places for off-by-one errors. "But Radewoosh, this problem is still solvable in a usual way, the trick is almost useless". Yep, but the convex hulls are only an example. Let's imagine the following task (which was on some Open Cup a long time ago if I'm not mistaken, but I'm unable to find it). We have to keep a multiset of strings, which is initially empty and perform a sequence of operations. In each of them, a new string comes, let's call it $s$. We have to calculate the total number of occurrences of the strings from the set as substrings of $s$. If some string from the set has multiple occurrences (it matches with substrings of $s$ in many positions), we have to count each of them separately. We should then print the calculated number and add $s$ to the set. We should aim for $O(m \cdot \log(m))$, where $m$ is the size of the input. A standard problem with strings.

Now the hard part: we should process queries online.

Without the online assumption, we could build an Aho-Corasick on all the words from the input and use some segment/Fenwick tree to make the new strings "active". The task is kind of well-known, so I won't describe how to exactly solve it. But we need to process queries online, and it's hard to update the Aho-Corasick tree and add new strings to it. The only easy thing is calculating a new one, with a new set of strings. Any ideas? Yes! Let's keep $\log(number\ of\ strings)$ tries instead of one and recalculate them whenever you need to do it. Also, we don't need any segment trees. Everything that we want is a sum of some values in the subtrees, so we can use a simple DP to calculate them.

That's all. I hope that you found this blog useful and that the trick will save some of you somewhen. I also hope that the blogs will again appear more often. If you have any questions or you find more applications for this trick, write about it in the comments. See you next time! :D #### История

Правки Rev. Язык Кто Когда Δ Комментарий
en1 Radewoosh 2019-05-28 01:41:58 6819 Initial revision (published)