Hello Codeforces,

I invite you all to participate in **HackerEarth's July Clash** on July 18th, 2015 at 06:30 UTC. The duration of the contest is 24 hours. I (FatalEagle) am the author of this contest. zxqfl is the tester, shef_2318 is the Russian translator, and belowthebelt oversaw the production of this contest.

The problemset consists of 4 traditional algorithmic tasks and 1 approximation task. For traditional algorithmic tasks, you will receive points for every test case your solution passes. For the approximation task, your score depends on the best solution in the contest so far. All tasks will contain subtasks, so you will know exactly how much you should expect to score if you submit a suboptimal solution. The subtasks are designed to be approachable for most people, so even if you don't have any clue how to solve the whole problem, you can earn a decent amount of points on the subtasks.

There are **prizes** for the top 5 contestants:

- $100 Amazon gift card and HackerEarth T-shirt
- $80 Amazon gift card and HackerEarth T-shirt
- $50 Amazon gift card and HackerEarth T-shirt
- HackerEarth T-shirt
- HackerEarth T-shirt

Good luck to all contestants!

P.S. I'm particularly interested in seeing IOI 2015 participants on the leaderboard. It is a little more than a week before the start of IOI 2015, and I hope participants will find the problemset worthwhile to solve as preparation.

**UPD: More than half the contest is over; about 10 hours and 30 minutes remain from the time of posting. Only 2 contestants have managed to solve 4 or more problems. Will you be the third?**

**UPD2: Congrats to the Top 5!**

1. anta

2. HellKitsune

3. qwerty787788

4. mugurelionut

5. Kmcode

Getting extremely nervous and excited at the same time looking at the leaderboard. Will mugurelionut be able to dethrone the top 3? He's there... almost there, it seems! :D

Oh, well, I wasn't there in the end, it seems :) My ideas for the challenge problem were not good enough and I spent way too much time on solving the problem "Upgrade" (because my solution uses the sqrt trick instead of splay trees, so it was timing out a lot until I eventually managed to optimize it enough).

I would be interested in finding out how the other contestants (especially the top ones) approached the challenge problem. I will also share my approach later, when I have more time to write it up.

And to the author: the problem set was very nice! Congratulations. But I do have one comment about the challenge problem. Because the score was computed as the sum of the scores over all test cases, the score on test cases with smaller K (where the answer could be higher) carried much more weight than the score on test cases with larger K (where it was more difficult to get similarly good answers). In my opinion, a scoring function that gave every test a similar weight would have been better.

Also, did anyone get positive scores for tests 60 and 91 of the challenge problem?

My approach for the challenge problem:

K = 1 — pick the value with the most bits set from A, then pick the value from B such that A[i] | B[j] is maximized. Most of the time this should give you all bits set.

K = 10 — pick the values A[i..i+9] whose sum is maximized. Then, when picking values from B, calculate the total result with brute force (10x10 iterations) and compare it to the best result so far.

K > 10 — pick A[i..i+k-1] and B[j..j+k-1] such that each sum is maximized.

You can also stop searching for better values in the current array (leaving more time for the other one) if you come across two zeroes in a row (possible, since the values b0, b1, d0, d1 are generated randomly), but this doesn't change the score much (0.1-0.2 final points).
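The window-selection step in the heuristics above (find the K consecutive values with the maximum sum) can be done in O(N) with a sliding window. A minimal sketch, assuming a plain list of values rather than the contest's actual generated arrays:

```python
def best_window(a, k):
    """Return (start, sum) of the k consecutive elements of a with maximum sum,
    computed with a sliding window in O(len(a))."""
    cur = sum(a[:k])              # sum of the first window
    best, best_start = cur, 0
    for i in range(k, len(a)):
        cur += a[i] - a[i - k]    # slide the window one step to the right
        if cur > best:
            best, best_start = cur, i - k + 1
    return best_start, best
```

Running it on both A and B independently gives the K > 10 heuristic; for K = 10 you would then still brute-force the pairing as described.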

My approach didn't have special cases for small and large K. Reduce N to something small (e.g. around 600000-700000). Then compute all A[i]'s and B[i]'s up to N. Then compute a score for each interval of K consecutive values from the arrays A and B (I used the sum of only the most significant bits, few enough that the result fits in an unsigned int; this made it slower because I iterated over the bits, and a normal sum over long long would probably have been faster).

Then I kept the best X values of A and Y values of B (X around 1500) and tried all their combinations for finding the best (x,y) pair. The score of a combination is exactly the sum of the elements (I can compute it by iterating over bits and deciding how many bits of each value exist in the KxK rectangle). For K=100000 this could have exceeded long long, so I only used the most significant bits here (from bit 29 to bit 9, thus excluding bits 8...0).

Maybe a better approach would have been to use a higher limit of N and a lower limit of X and Y. This way, I could have found more promising intervals, but I would have tested fewer combinations. Or simply let X be large and Y be small, or something similar.

For K = 1 I also thought about finding the best matching A[i] and B[j] (the pair with the most bits set together), but I couldn't find an efficient way to do this in the late hours of the contest, so I just kept the approach I already had and tried to tune its parameters more. I would have liked to spend more time on the challenge problem (which is the main reason I participate in HackerEarth contests), but it took me many hours (from the evening until the morning) to get 100 points on the problem "Upgrade". And in the overall standings, getting the 100 points on "Upgrade" was more valuable than a few more percentage points on the challenge problem.

My approach was just to do your K > 10 approach. I actually tried computing values in the square, but found that way too slow (I couldn't choose as many random points). Just doing that was able to get a score of 98.82. I tried a few different things, but nothing seemed to improve it that much.

My solution prior to isolating the K = 10 case scored 97.84 points, and the one after scored 99.06. For K = 100, though, it was definitely too slow and the score went down.

Last question: nope for me. It could be that the differences between averages were mostly too small in these tests, so rounding down cut the score to zero...

The challenge problem this time had an initial barrier: figuring out how to compute scores efficiently and deal with large N. I just cut N down to at most 3·10^{5}, found A[] and B[] and did prefix sums on the most significant b bits, which allowed me to compute the cost for a fixed (x, y) in O(b). Next, I tried many random positions and chose the best one; sometimes I also chose a few rows and tried all x in these rows, or did the same with a few columns. Plus small optimisations like a smaller N for smaller K (surprisingly, an additional constraint of N ≤ 200K gave better scores) or a small b for small K.

Splay trees, huh? I did google something like that about some reversals and trees, but with the sleeping time I had the last few nights (4 hours before TCO), I was glad to write the 45pt subtasks at least. At least I have another thing to add to my copy-paste library.

The sqrt trick solution was found to time out during the testing phase, but I set the time limit to 8s because I didn't want potential Java solutions to time out (my own solution in C++ ran in under 3s on all cases). I didn't realize Java already had a 2x multiplier on the time limit, so maybe the larger-than-necessary time limit led people to think some slower solution had a chance of passing. Sorry if that is what happened to you.

I'm not very experienced in making challenge problems; I expected the scores to cluster in the 70-100 range, but I figured it was fine since the top 5 contestants should already have full score on the other problems (I was off by a little here). Can you suggest a better scoring function, so I have an idea of what is good and what isn't next time? Or maybe I should just drop the small K cases?

The grids in tests 60 and 91 are composed of mostly 0's because of the way the arrays are generated, so the optimal solution is (0, 0). I considered removing them, but then the tests wouldn't really be randomly generated, and it felt a bit dishonest.

I will explain the "sqrt trick" solution that I was referring to in my first comment, to make sure we're on the same page.

For a line (i.e. in the array case), you can always describe the current status of the array relative to the intervals (in ascending or descending order) of the initial array. For instance, the current array can have: the values from positions [3..5] of the initial array in ascending order, followed by the values from the positions [7..10] of the initial array in descending order, etc.

Initially, the status of the array is just one interval (the interval [1..N] in ascending order). Every reverse operation (u, v) adds at most two more such intervals: we split some existing intervals so that positions u and v become interval endpoints, and then simply reverse the order of the intervals between u and v (in time proportional to their number).

If you apply this algorithm, it will be quite efficient in the beginning, but it becomes less and less efficient as the number of intervals grows. To compensate, once there are too many intervals (e.g. O(sqrt(N)) of them), we rebuild the whole array in linear time: we recompute the value at each position and declare this the new initial array, thus reducing the number of intervals back to 1.

Overall, this approach clearly takes O(sqrt(N)) amortized time per operation.
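For the array case, the scheme above (intervals of a base array with a direction flag, plus a periodic rebuild) can be sketched as follows. This is a simplified illustration of the technique, with 0-based positions, not anyone's actual contest code:

```python
import math

class SqrtReversalArray:
    """Array supporting range-reverse via the sqrt trick: the current state is
    a list of (start, end, direction) intervals over a base array; once there
    are too many intervals, the whole array is rebuilt into one interval."""

    def __init__(self, a):
        self.base = list(a)
        self.n = len(a)
        self.pieces = [(0, self.n - 1, 1)]   # direction: +1 asc, -1 desc
        self.limit = max(2, 2 * math.isqrt(self.n))

    def _split_at(self, pos):
        """Make position pos the start of some piece; return that piece's index."""
        off = 0
        for idx, (s, e, d) in enumerate(self.pieces):
            ln = e - s + 1
            if off + ln > pos:
                k = pos - off
                if k == 0:
                    return idx
                if d == 1:   # split ascending piece after its first k values
                    left, right = (s, s + k - 1, 1), (s + k, e, 1)
                else:        # descending piece starts at base[e] going down
                    left, right = (e - k + 1, e, -1), (s, e - k, -1)
                self.pieces[idx:idx + 1] = [left, right]
                return idx + 1
            off += ln
        return len(self.pieces)

    def reverse(self, u, v):
        """Reverse positions u..v (inclusive)."""
        i = self._split_at(u)
        j = self._split_at(v + 1)
        # reverse the order of the pieces and flip each piece's direction
        self.pieces[i:j] = [(s, e, -d) for (s, e, d) in reversed(self.pieces[i:j])]
        if len(self.pieces) > self.limit:
            self.base = self.to_list()           # linear-time rebuild
            self.pieces = [(0, self.n - 1, 1)]

    def to_list(self):
        out = []
        for s, e, d in self.pieces:
            seg = self.base[s:e + 1]
            out.extend(seg if d == 1 else reversed(seg))
        return out
```

Each reverse touches O(sqrt(N)) pieces between rebuilds, and a rebuild costs O(N) but happens only every O(sqrt(N)) operations, which gives the amortized O(sqrt(N)) per operation mentioned above.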

So what I did was to apply this approach to the paths of a heavy light decomposition. Every path is described as a sequence of intervals (in ascending or descending order) of positions of paths from the initial tree, e.g. path 1 consists of: the positions [3..5] of path 2, in ascending order, followed by the positions [7..10] of path 3 in descending order, followed by the positions [2..4] of path 1 in descending order, etc.

Every operation introduces some splitting points, in order to be able to properly gather all the "original" intervals contained in an interval of positions of a path from the decomposition (and we know that the path between two vertices u and v consists of several such intervals of positions of at most O(log(N)) paths from the decomposition).

Then, for a reverse operation, the "original" intervals forming the path from u to v are reversed. This is not as simple as changing their order, though. When swapping and reversing two intervals from two different paths we need to first "trim" the longer interval to the length of the shorter one (i.e. split it in two) and then swap the equal length pieces and then continue. Thus, many more splitting points (i.e. intervals) are introduced this way (O(log(N)) additionally per operation).

Finally, one last piece of the algorithm is to rebuild the whole tree (i.e. reduce the number of intervals to 1 per path) when there are too many intervals overall.

This approach gets TLE and it took me a long time to optimize it enough. First, I introduced a "re-merging" operation, to re-merge consecutive intervals of the same path, which can form a larger interval. Then, in my solution, I repeatedly needed to find out the interval containing a given position P. I tried linear search, STL sets, a combination of them, but they were too slow. The final thing was to notice that in most cases I already had this information readily available, so I just used it.

I'm not sure what the final time complexity of my solution is (something between O(N*(log(N) + sqrt(N))) and O(N*log(N)*sqrt(N)), I guess).

For the scoring function, one suggestion is to weight the scores by a function of K, so that the scores on larger K cases are comparable to the rest. If it's possible, another suggestion would be to normalize the scores per test case, based on either ranks or the raw scores (i.e. give a score of 0-1 per test case, which you sum up later). This is basically a stronger version of the previous suggestion, and it would also generalize to other challenge problems.

As an example, when I was a TA, we had an approximation problem in class, and we scored it as follows. For each test case, rank the teams from 1 to N. Then give the team ranked i a score of c^(i-1), where c is some constant chosen between 0 and 1. If there are ties across ranks i to j, give each of those teams a score of (c^(i-1) + ... + c^(j-1))/(j-i+1). This scoring rewards getting the best score on a few test cases, so a few very good cases would outweigh doing averagely on most cases (which may or may not be desirable). Also note that this method applies to any approximation problem. Of course, it may not even be possible if you can't do normalized scores based on ranks.
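The rank-based scheme above can be sketched as a small helper (a hypothetical illustration; the dict-based interface and the default c are my own choices):

```python
def rank_scores(raw, c=0.5):
    """raw: {team: raw score on one test case}, higher is better.
    The team ranked i (1-based) gets c**(i-1); teams tied across ranks i..j
    each get the average of c**(i-1) .. c**(j-1)."""
    order = sorted(raw, key=lambda t: raw[t], reverse=True)
    scores = {}
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and raw[order[j + 1]] == raw[order[i]]:
            j += 1                                   # extend the tie group
        share = sum(c ** r for r in range(i, j + 1)) / (j - i + 1)
        for t in order[i:j + 1]:
            scores[t] = share
        i = j + 1
    return scores
```

Summing these per-test values over all test cases gives every test the same maximum weight regardless of K.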

How to solve D?

I used HLD where each path of decomposition is a treap. You can cut the chunks of paths from u to v out of HLD, merge them into one treap, reverse and veeeery carefully put it back.

Probably possible with splay trees too as mentioned above, but I have no experience with them.

Glad to say I am impressed with the problem set.

My favourite task was C as it jogged my memory on the Divide & Conquer dynamic programming optimization, which I rarely come across in programming contests.

Thanks for solving the problems!

My solution for C also uses the Divide and Conquer optimization, but the tester's solution used the convex hull trick, which has a better asymptotic complexity. On the test cases, though, it turned out that my solution was a bit faster on the worst case, and it was also much easier to code (so there was little chance of bugs, and it worked on the first try), so we decided not to force the faster solution. In hindsight, the convex hull trick solution seems to be the more obvious one to people, judging by some of the submissions. I think it's nice having a choice: code the first solution that comes to your mind, which will work but may need more debugging, or think about alternative approaches and find a simpler one.

The convex hull trick is kind of hard to code on the first try when you don't have a grasp of what to do, but it becomes much easier later (as with most algorithms, anyway). I hadn't used it for 2 years and yet pulled it off quite easily. And there was no adding/removing of lines necessary here — even if the towers should be placed from left to right, nothing else can give the optimal solution.
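A minimal version of that flavor of the convex hull trick (lines added once in decreasing slope order, minimum queried at non-decreasing x, no deletions) is sketched below. This is a generic illustration of the data structure, not the actual solution to C:

```python
class MonotoneCHT:
    """Lower envelope of lines y = m*x + b. Lines must be added with
    non-increasing slopes; queries must come with non-decreasing x, so a
    single pointer sweep suffices (amortized O(1) per operation)."""

    def __init__(self):
        self.hull = []   # list of (m, b)
        self.ptr = 0

    @staticmethod
    def _bad(l1, l2, l3):
        # l2 never attains the minimum if l1 and l3 intersect at or before
        # the intersection of l1 and l2 (cross-multiplied to avoid division)
        return (l3[1] - l1[1]) * (l1[0] - l2[0]) <= \
               (l2[1] - l1[1]) * (l1[0] - l3[0])

    def add(self, m, b):
        while len(self.hull) >= 2 and self._bad(self.hull[-2], self.hull[-1], (m, b)):
            self.hull.pop()
        self.hull.append((m, b))
        self.ptr = min(self.ptr, len(self.hull) - 1)

    def query(self, x):
        h = self.hull
        while self.ptr + 1 < len(h) and \
              h[self.ptr + 1][0] * x + h[self.ptr + 1][1] <= \
              h[self.ptr][0] * x + h[self.ptr][1]:
            self.ptr += 1
        m, b = h[self.ptr]
        return m * x + b
```

When, as described above, both the inserted slopes and the query points are monotone, no binary search or line removal is ever needed.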

Thank you for preparing these interesting problems. C in particular was very nice. I knew the convex hull trick is one of the dynamic programming optimizations, but I didn't understand how it works. After reading the editorial, I understood the convex hull trick completely. I couldn't become an IOI 2015 participant, but I thought this problemset was worthwhile to solve as preparation for a future IOI.

Hi, it was a really nice problem set, thank you.

In problem C, I don't really understand why it is always enough to use fewer than 60 towers. The editorial says that after that point it's simply better to put just two towers, at positions 1 and N.

But suppose,

N = 10^{5}, K = 101 and all A_{i} = 1024. We can put 101 towers such that the distance between two consecutive towers is 1000. If we calculate the cost, it will be around K^{2} * (K * 1024) + 1000^{2} * (K - 1) = 101^{3} * 1024 + 1000^{2} * 100 ≈ 1.1 * 10^{9}, but if we put just two towers at positions 1 and N, the cost will be around K^{2} * 2 * 1024 + 100000^{2} ≈ 10^{10}.

I also noticed that you used the Divide and Conquer optimization in your solution for that problem. Can you please explain how you came up with the idea that this optimization technique applies here? I learned the technique recently but fail to use it properly.

It is always enough to use fewer than 60 towers, but the editorial is a bit incomplete about the reasoning (in your example, the optimal solution uses 32 towers). In general, comparing against the cost of placing towers only at positions 1 and N gives an upper bound on the number of towers you need; this bound is about 266 for the worst case (I tried all possible values of N and K). My solution posted in the editorial uses this bound, and takes a bit less than half the time limit on the worst case. By noticing that the example you gave leads to the maximum number of towers used for some value of K, you can use trial and error to find the maximum bound (use WolframAlpha to get an approximation first). The worst case is when K = 56 or 57, where you need to use 56 towers to get the optimal solution. This is one of the tests.

I came up with the divide and conquer idea when I wrote the naive DP, and it turned out that all decision points were monotonic (at least, in all the cases I could generate). So I did some research, and it turns out the cost function satisfies the quadrangle inequality. I imagined it this way: two invocations of the cost function represent two overlapping squares. When you subtract their intersection (also a square), it's clear that the remaining amount is smaller than for the union of the two original squares. From this comment it seems like this is enough to use the divide and conquer optimization.
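The divide and conquer optimization itself (recurse on the middle index, and use the monotone argmin to narrow the candidate split points on each side) can be sketched generically. The quadratic cost in the test is just an illustrative function satisfying the quadrangle inequality, not problem C's actual cost:

```python
def dnc_layer(prev, cost, n):
    """Compute cur[j] = min_{0 <= k <= j} prev[k] + cost(k, j) for all j,
    assuming the optimal k is non-decreasing in j (which holds when cost
    satisfies the quadrangle inequality). O(n log n) cost evaluations."""
    INF = float('inf')
    cur = [INF] * n

    def solve(lo, hi, klo, khi):
        if lo > hi:
            return
        mid = (lo + hi) // 2
        best_k = klo
        for k in range(klo, min(mid, khi) + 1):   # candidate split points
            val = prev[k] + cost(k, mid)
            if val < cur[mid]:
                cur[mid] = val
                best_k = k
        # the optimum for j < mid lies in [klo, best_k],
        # and for j > mid in [best_k, khi]
        solve(lo, mid - 1, klo, best_k)
        solve(mid + 1, hi, best_k, khi)

    solve(0, n - 1, 0, n - 1)
    return cur
```

Applying this layer once per tower (up to the ~60-tower bound discussed above) is the shape of the D&C solution; note that adding any prev[] term doesn't break monotonicity, since it cancels in the quadrangle inequality.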

Thanks for the correction, I updated the editorial.