### egor.okhterov's blog

By egor.okhterov, history, 5 years ago,

We have a tree with n (2 ≤ n ≤ 1000) nodes and we need to find k (1 ≤ k ≤ 100 and k ≤ n - 1) vertices such that the path going through all of these vertices has the maximum length.

Note:

• The path should start at vertex 1 and finish also at vertex 1.
• The algorithm has to return just the length of the maximum path (the path itself is not needed).
• The weight for an edge has following restrictions: 0 ≤ w ≤ 100000.

Let's say we have the following tree with n = 5 vertices:

We need to find k = 3 vertices which will give us the longest path.

For this tree the maximal path is the following:
15241

It has the total length of 13 + 18 + 10 + 5 = 46.
So, for that particular tree we have to print 46 as our result.

I came up with a following greedy/dp-like solution. First we solve the problem for k = 1 and we remember this solution in a linked list 1v1. After that we try to solve the problem for k = 2 by trying all of the remaining n - 2 vertices and insert it in the current path: 1uv1 and 1vu1. After going through all of the vertices we choose the one which gave us the best result. Then we proceed to solve k = 3.

The problem is that it looks like this solution is not correct, because it fails the tests for this problem. I cannot prove that my solution is correct and I cannot disprove it. All I managed to do is to generate millions of different random trees and in all of these cases my clever solution matched bruteforce solution.

For now all my effort is directed towards generating a counter example in which my method will fail, but if it is correct I'd be glad to see the reasoning why is it correct.

• +17

 » 5 years ago, # | ← Rev. 3 →   0 Are all the k vertices required to be distinct?
•  » » 5 years ago, # ^ |   0 You are correct. We have to choose k distinct vertices, but we are allowed to cross a vertex more than once while walking along the path.
•  » » » 5 years ago, # ^ |   0 In your solution, are you maintaining distinct vertices somehow?
•  » » » » 5 years ago, # ^ |   0 Yes, I am maintaining the largest path that I currently have in the array: int next[1001] = {}; // linked list And I maintain the vertices that are already in the path in the array: bool used[1001] = {}; 
•  » » » » » 5 years ago, # ^ |   0 Your method is greedy.Suppose k=5, and you find the longest path 1->2->3->4 at k=4, now you can't choose these vertices again as targets. What if 1->2->5->3->4 is longer than 1->2->3->4->5 but 1->2->5 is shorter than 1->2->3 ?
•  » » » » » » 5 years ago, # ^ |   0 What if 1->2->5->3->4 is longer than 1->2->3->4->5 but 1->2->5 is shorter than 1->2->3 ? This is actually the level of my current progress =)I wasn't able to construct a real tree which fails this solution.
•  » » » » » » » 5 years ago, # ^ | ← Rev. 3 →   0 You are picking the longest path, and then back to 1, then the second longest and back to 1, third longest and so on, keeping in mind that you don't re-target a vertex. But you don't need to check 'back to 1' every time. Just reaching the target might be sufficient.Maybe a simple fix like terminating greedy at k-1 iterations, and then checking kth iteration + path to 1 for every un-targeted vertex will work. But I'm not sure. I haven't proved it yet.
•  » » » » » » » » 5 years ago, # ^ |   0 Maybe a simple fix like terminating greedy at k-1 iterations, and then checking kth iteration + path to 1 for every un-targeted vertex will work. Is my solution wrong? If it is, how does the tree look like which fails my solution?
•  » » » » » » » » » 5 years ago, # ^ | ← Rev. 2 →   0 I don't know if I can help you find a test case right now ( I am sleepy ) but I think the order of choosing vertices matter. So if optimal order has 3 before 4, and 1->2->3 > 1->2->4 but 1->2->3->1 < 1->2->4->1, you choose 1->2->4->1, therefore changing the optimal order.I mean, instead of checking 1->v->u->1, check only 1->...->v->u until k-1. At final k, check u->w->1.Not proven. Just an idea. Good luck :)Another idea : graph and k-cycle. Good luck :)And let us know if you get AC :)
•  » » » » » » » » » 5 years ago, # ^ |   0 You should also consider cases where your candidate paths have equal values. As the order of choosing matters, we might make a mistake here.
•  » » » » » » » » » 5 years ago, # ^ |   +3 This argument seems right, but a real tree that breaks this algorithm will help tremendously.
•  » » » » » » » » » 5 years ago, # ^ | ← Rev. 4 →   0 Looks like I have one :)1 2 102 3 202 5 201 4 20and k = 3Your algorithm chooses as follows : 1->3->1 or 1->5->1 . Let's say it chose 1->3->1.Now it has option 1->3->4->1 = 100 and 1->3->5->1 = 100. Now your algorithm chooses one of these without any other criteria, but we can see that the optimal order is 1->3->4->5->1.
•  » » » » » » » » » 5 years ago, # ^ |   0 My algorithm returns the length of 160 for that tree: http://ideone.com/NO769r.Is that correct?
•  » » » » » » » » » 5 years ago, # ^ | ← Rev. 6 →   0 Lol, yes! But only accidentally. You must've chosen 1->3->4->1 instead of 1->3->5->1, but they have equal probability of picking by your algorithm. :)The point is, your algorithm doesn't handle equal candidates well. I hope you get the point. I can't spend any more time on this :)I added some debug lines in your code your codeThe output suggests your algorithm doesn't do what you said it does. max_path_length=60best_v=4 best_increase=40 max_path_length=100best_v=5 best_increase=60 max_path_length=160Case #1: 160 If in the start, max_path_length=60, then when best_v=4, why is increase=40? Then when best_v=5, increase is 60? These values are not what I expected to see. Btw, the input for code is slightly different. I wanted to show that your code gives wrong answer on slightly changing the above example I gave.
•  » » » » » » » » » 5 years ago, # ^ | ← Rev. 2 →   0 I can't spend any more time on this :) I am spending my time on that, because it might be so that there has to made just a small incremental change and suddenly the algorithm becomes correct.For example, Dijkstra's algorithm is also greedy, but it works because we are traversing all the vertices in the right way. Maybe if we always choose the correct vertex (among all of the vertices that give the path with the same length), the algorithm suddenly becomes correct =)
•  » » » » » » » » » 5 years ago, # ^ | ← Rev. 2 →   0 your algorithm doesn't do what you said it does. It does exactly what I said it does =) the input for code is slightly different. The input is exactly the same as what you have suggested =)Each line describes an edge:parent weightparent weight
•  » » » » » » » » » 5 years ago, # ^ | ← Rev. 2 →   0 No, I changed the input.Swap positions of 4 and 5, like here :1 2 102 3 202 4 201 5 20 and k = 3The input format is different than how I presented. So I made some assumptions about the input format. Try this modified example.
•  » » » » » » » » » 5 years ago, # ^ |   0 Try this modified example. The result is 160: http://ideone.com/q2NzDS
•  » » » » » » » » » 5 years ago, # ^ | ← Rev. 2 →   0 Output using your algo should be : max_path_length=60best_v=4 best_increase=40 max_path_length=100best_v=5 best_increase=40 max_path_length=140Case #1: 140 According to your algo, answer is 140, which is wrong. According to your code, the answer is 160, which is correct, but your code != your algo. Just try with pen and paper and you will see why this is the output for your described algorithm.Your code is doing something else. Find the bug, and your algo will give wrong answer on this test case.
•  » » » » » » » » » 5 years ago, # ^ | ← Rev. 2 →   0 I can't find a bug: http://ideone.com/uQebyBThe algo constructs the paths like that:1 ⟶ 3 ⟶ 1  + 601 ⟶ 4 ⟶ 3 ⟶ 1  + 401 ⟶ 4 ⟶ 5 ⟶ 3 ⟶ 1  + 60 Your code is doing something else. My code does exactly what I have described.
•  » » 5 years ago, # ^ |   0 Btw, for the given tree, it's 13+12+10+5 No. The distance between vertex 2 and vertex 5 is 5 + 3 + 10 = 18.
•  » » » 5 years ago, # ^ | ← Rev. 2 →   0 ...
•  » » » » 5 years ago, # ^ |   0 I am sorry, I don't understand.Are you talking about the edge 3⟶4?This is the only edge in this tree that has length 2.
•  » » » » » 5 years ago, # ^ |   0 Oh I confused path weights and vertex numbers. My bad.
 » 5 years ago, # | ← Rev. 2 →   +14 solution forgetting about K below :(( wrongI haven't managed to prove your solution wrong, but I think I found one solution I can prove. If it's indeed correct you could use it to find counterexamples for yours. First let's find an upper bound on the maximum possible value of the sum. Analyzing each edge separately, it appears at most 2 * size(minimum_subtree_size) times in the path we are looking for, where minimum_subtree_size is the minimum size of the two subtrees formed when you delete that edge. Then summing that up for all edges, we get an upper bound on the result.I claim that upper bound is attainable. Lets root our tree in a node such that none of the subtrees generated by its sons has size greater than n / 2, so each subtree has size less than or equal n / 2. Then we can find a path (a cycle really, because we have to go back to node 1) where each pair of consecutive nodes in the cycle belong to a different branch of the tree. That path could be found greedly.Moreover, by the property of the selected root, all edges will fulfill the upper bound from the first paragraph. So we have a solution.I know this is somehow not very well explained, and there is certain probability of mistakes, so check it and let me know :)
•  » » 5 years ago, # ^ |   +11 How would you use K in your solution?
•  » » » 5 years ago, # ^ |   +4 hahah that was what I was missing :) sorry for that, I don't really know how to use it :((
•  » » 5 years ago, # ^ |   +3 Not sure what you meant exactly, but is it this : Make a graph with edge from each vertex to every other vertex and find the largest cycle of size k+1 in this graph starting and finishing at 1.
 » 5 years ago, # |   -18 Here is a solution: First, consider the problem where we do not need to return to the first vertex at the end. Then, one can notice that the optimal solution is to, at each step, visit an arbitrary farthest vertex from the current vertex (and mark the current vertex as visited). The complexity of the solution to this problem is O(N * K)However, now we can iterate over all possible ending vertices and perform the same algorithm. Total complexity would be O(N^2 * K).
•  » » 5 years ago, # ^ |   0 Why is everyone downvoting? If there's something wrong with the solution, could someone explain what it is?
•  » » » 5 years ago, # ^ |   0 I have upvoted your answer, because I appreciate any involvement in solving the problem.However, I don't understand your solution. Maybe that's the case for the others to downvote (but I'm not sure).You could write the program and see if it'll be accepted on the online judge.
•  » » » 5 years ago, # ^ |   0 I'm not sure but isn't this the same as solving the TSP greedily?
 » 5 years ago, # |   0 Since for each node of the route we only care about the next node(I will call it matching a node with another one), now we can do this: Dp[v][i][false,true]=the maximum lenght of a route in v's subtree such that we have i not matched nodes yet and true=the second node of the route is in this subtree/false=otherwise. O(n*k^2)
•  » » 5 years ago, # ^ |   0 How do you incorporate the fact that we have to go back to node 1 finally too ???
•  » » » 5 years ago, # ^ |   +8 Dp[1][1][1]
•  » » 5 years ago, # ^ |   +6 Can you explain in more details?
•  » » 5 years ago, # ^ |   +5 How do you keep track of which nodes you have already visited? You might have to get re-enter a subtree a few times to increase the length.
 » 5 years ago, # |   +3 managed to get a very large input where your solution fails (it's smaller than answer shown in udebug) https://puu.sh/uEbNc/f1d8e8f44c.in
•  » » 5 years ago, # ^ |   0 Great! That is the first progress for the whole time and all what was needed is a little help of the red guy =)Did you just generate a random tree or it has some special properties?
•  » » » 5 years ago, # ^ |   0 It is k=6 (or some near integers) chains attached to root 1.
 » 5 years ago, # |   +16 All I managed to do is to generate millions of different random trees and in all of these cases my clever solution matched bruteforce solution. You likely made some mistake in the generator/script. I have run your solution on hundreds of small tests and it was enough. 1 6 3 1 2 2 2 3 2 3 1 4 3 Your solution finds a path 1-5-2-6-1 with total length 24. The optimal path is 1-6-2-4-1 with total length 26.(drawn with CS-Academy tool)
•  » » 5 years ago, # ^ | ← Rev. 2 →   0 You likely made some mistake in the generator/script. I have run your solution on hundreds of small tests and it was enough. It looks like I had a bad generator. Now, looking at this tree it doesn't seem to me that my algorithm can be cured, though I'll try to think about it a little bit more. Well, what I notice is that this optimal path 1-6-2-4-1 can be built by looking for the furthest vertex from the current one. Again, I'm not sure that this will work in general =)
•  » » » 5 years ago, # ^ |   -8 Now that's exactly what I said :PPP
•  » » » 5 years ago, # ^ |   0 Well, what I notice is that this optimal path 1-6-2-4-1 can be built by looking for the furthest vertex from the current one. Again, I'm not sure that this will work in general =) This approach is easier to break. Consider a graph with edges: 1-2 of length 10, 1-3 of length 1, 3-4 of length 7, 3-5 of length 7. The optimal path is 1-4-2-5-1, while the greedy starts with 1-2-...
•  » » » » 5 years ago, # ^ | ← Rev. 2 →   -27 Checkmate =)I have no more new ideas. Wow, maybe we can run these 2 algorithms and choose the max between them? =)
 » 5 years ago, # | ← Rev. 2 →   +31 root your tree in node 1let us say there is an edge between node u and node v , if you choose to visit [ x ] nodes in the subtree of v then there is x nodes there and K — x nodes outsidethen you will traverse the edge min(x , K — x) * 2 times if you behave optimally and you can proof that you can always do thatit turned out to be a normal knapsack on a treedp[node][visited nodes in its subtree]call dp[1][K-1]
•  » » 5 years ago, # ^ |   0 How to prove that number of traversals are min(x , K — x) * 2? It is easy for a line tree, but I am not so sure how to prove for a general tree(or extend it). Thanks in advance!
•  » » » 5 years ago, # ^ |   0 It's easy. If x > k-x then you will go outside the subtree k-x times, as after going outside k-x times, you have no more nodes to visit outside the subtree. The remaining x — ( k — x ) nodes are inside the subtree, so you will not go outside the subtree now. Similarly for x < k-x.
•  » » 5 years ago, # ^ |   0 The Kth traversal is to the parent of 1(outside 1's subtree), and edge_cost(1,parent of 1) = 0. Is this right?
•  » » 5 years ago, # ^ |   0 I still don't understand. In a knapsack problem we have a set of things which have 2 attributes: weight and value. What is our set of things and what are theirs attributes?
•  » » » 5 years ago, # ^ |   +3 It's about distributing the nodes on subtrees.the value is calculated as i mentioned aboveBTW this is the "only" cool problem in all Arab Region contests :D
•  » » 5 years ago, # ^ | ← Rev. 2 →   +3 So your time complexity is O(N*K^2)?If your answer is yes then I think it's not good enough (there are t<=100 test cases in the problem).UP: My mistake. It's OK.
•  » » 5 years ago, # ^ | ← Rev. 2 →   0 1) Why do we call dp[1][k-1] instead of dp[1][k] ???? 2) We traverse an edge min(k,k-x)*2 times ,but say all the k nodes lie in the same subtree ,Then the edge joining will be traversed (0) times ???? Please correct me if I got anything wrong . Thanks in advance..