This problem recently appeared in the GSA-ULTRA competition (which ended last Sunday).

Can anyone propose a solution?

We have a N <= 100,000 node rooted tree.

All nodes have integer weights (**and each weight is between 0 and 1000**).

Let the height of the tree be the longest path to a leaf from the root (and the root is just defined to be node 0). The "longest" path here refers to the path with largest sum of node weights. You can change the weight of W nodes (W <= 50,000) to 0. Find the minimum height you can get if you do this optimally.

Auto comment: topic has been updated by FiveAtomTree (previous revision, new revision, compare).I think something like this might work.

Store the distance from root to vertex (i.e. sum of node weights) for each vertex. Change the weight of the first nonzero edge on the longest path to zero. You update the distance values using a HLD+segtree. The segment tree will have a subarray add update and a max query to find the leaf with the max distance. You'd also need to store on each HLD chain the index of the bottom vertex of first nonzero edge.

Complexity: $$$O(n \log^2 n)$$$.

Can you explain how this algorithm works in the following example?

Three nodes: 0, 1, 2 (where 0 is the root)

Weights: 0, 5, 10 (respectively)

Edge from 0->1 and edge from 1->2. You can change the weight of exactly 1 node to 0 (so W = 1).

In this case, it's optimal to change the weight of node 2 to 0 (in which case you'd get a height of 5).

How does your algorithm achieve this?

From the seg tree, find the leaf $$$L$$$ with max distance, which is $$$2$$$.

Then using HLD find the maximum weighted node $$$V$$$ in the path from $$$0$$$ to $$$L$$$, which is again $$$2$$$.

Then subtract the weight of $$$V$$$ from the distances of all leaves in the subtree of $$$V$$$ using seg tree. In this case only the weight of $$$2$$$ will become $$$0$$$.

Also update the weight of $$$V$$$ in HLD chain.

Then find the leaf with minimum distance using seg tree, which is $$$1$$$ in this case.

Isn't this a different algorithm than the one ramchandra proposed? You take the maximum weighted node, while he takes the first non-zero one.

Though this new algorithm likely still has issues, because removing the maximum node along the path at every iteration isn't optimal (since removing an ancestor of the node that is lower weight could be better as it would affect more leaf nodes).

I just noticed this flaw. Maybe that's the reason he chose the first non-zero edge?

But I can't prove that as an optimal choice either.

The case I gave in response to his post should invalidate that greedy algorithm, because it would pick the first non-zero weight (i.e. 5) instead of the better weight of 10.

But, perhaps I'm misunderstanding his algorithm, not sure.

Can you prove this greedy algorithm?

We can have two multisets A,B... A denotes things to be reset and B denotes normal edges... If we encounter weight Val and size of A is less than W, then we add val in A otherwise we check it with smallest element of A, if val<smallest element of A then add it to B otherwise Add it to A and add smallest element of A to B we can maintain the current Value of Path while doing so... and while returning back we just need to reverse what we have done, like we usually do. Complexity is like O(nlogn) Like just think how we would have solved the problem in O(N) if there were no special moves of resetting values.

Can you prove this greedy algorithm?

Most people passed the problem with the following DP:

In each set you maintain a dp[u][k] = min possible height if we remove k edges in the subtree of u. Clearly the dp is quadratic.

To make it "fast enough", you notice that the merge of the children dp-s is equivalent to simply getting their lists, concatenating them and finally sorting in decreasing order. Then we are left with the cost of the current node, hence we do dp[u][k] = min(dp[u][k — 1], dp[u][k] + cost).

This way the answer will be in dp[root][k].

I had a solution with a treap + small to large trick, but it was nowhere as fast as the quadratic one. Also I think you can binary search the answer and then calculate the same dp as above, but just keep a couple of variables instead of the lists (as you only care whether the dp-s are < mid or not).

Yeah I used this quadratic solution too.

What was the complexity of your treap solution?