I_love_Tanya_Romanova's blog

By I_love_Tanya_Romanova, 9 years ago, In English

Hello everyone!

I want to invite you to participate in March Clash at HackerEarth. Contest is scheduled on this Sunday. It lasts for 12 hours, so almost everyone can find some time at least to try it :)

There will be five tasks in a problemset. Four of them are standard algorithmic problems with partial solutions allowed — you get points for every test that you solution passed. And the last task is an approximation problem — you score for this task depends on how good your solution is comparing to current best solution.

laoriu is author of this problemset — you may know him from preparing Codeforces Round 277 (Div. 2) and December Clash 2014. I was working on this contest as a tester. I would like to say that I find this problemset interesting and it was also a great experience for me to work with laoriu on preparing it. Also I want to thank to MDCCXXIX who helped me a lot with technical part of preparation.

Top 3 winners will receive HackerEarth T-shirt. (I already got mine for one of previous contests :) )

Good luck to everybody — I hope to see you at the scoreboard :)

(Update)

The contest has ended! Thanks to everyone for participating:)

Congratulations to winners:

1) mugurelionut

2) HellKitsune

3) pavel.savchenkov

| Write comment?
»
9 years ago, # |
  Vote: I like it +1 Vote: I do not like it

I already got mine for one of previous contests :)

Did you receive it via Inkmonk?

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I got email with link to stickystamp form, filled in that form — and t-shirt arrived in few weeks via FedEx.

    However, I also have some recent HackerEarth mails from Inkmonk (and it seems that those items have not been delivered yet), looks like different delivery services are used for different contests on that platform, depending on organizer of contest. I've already asked admins to change text of greeting in a way that at least allows us to understand for what contest this prize is:)

»
9 years ago, # |
  Vote: I like it +5 Vote: I do not like it

The contest has ended, thanks to everyone for participating :)

You can already find author's solutions for all algorithmic problems in editorials of these problems; today I will also add tester's solutions (just want to make them look like not so ugly as my code usualy does). Some time later detailed editorials will be also added.

Feel free to discuss problems here:) Also I think that admins of HackerEarth will be glad to see any feedback about platform and contest)

»
9 years ago, # |
  Vote: I like it -9 Vote: I do not like it

How to solve Connected Components ?

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +9 Vote: I do not like it

    Let's assume connected components are black stones (N - M in total) and spaces between them are white stones (M in total). Let's use C black and C - 1 white stones to form the minimal possible chain of C connected components: bwbwbwbwb.

    Now we have to spread N - M - C black stones across C positions, then spread M - (C - 1) white stones across C + 1 positions ( + 1 because there are two extra positions for white stones to the left and to the right). So it's going to look like this: solve(N - M - C, C) * solve(M - C + 1, C + 1)

    Now what is this solve(int count, int positions) function? It's a choise of k elements out of n with repetitions allowed. solve() should return zero if count < 0, and C(count + positions - 1, positions - 1) otherwise.

»
9 years ago, # |
  Vote: I like it +12 Vote: I do not like it

I liked the problems, but they didn't seem very difficult. This was good because it didn't require contestants to work all the 12 hours on the problems (anyone could just start whenever they wanted — even starting solving the problems in the second half of the contest still gave one the chance to get a top spot). For me this was even better, because I participated in the contest mainly because I like challenge/approximation problems :) So I didn't have to spend too much time on the other 4 problems. That being said, I am curious how to solve "Bits transformation". My solution is a min-cost perfect matching algorithm (which I tuned in order to pass the time limit for the larger 3 tests). But all the solutions I looked at after the contest seem much simpler than that :) (a DP of some sort)

Regarding the HackerEarth platform: overall it seemed good. But there was this annoying bug with displaying the scores for the approximation problems in the contest standings. Basically, the score shown in the standings was always the score from the time when you made the last submission. What this means is that if you got the score of 100 during the contest and then stopped making submissions, a score of 100 would still be shown in the standings even if someone else made a better submission in the mean time (the correct score was shown at the end of the contest, though). This is fine as long as you're aware of the bug, but since this was only my second contest on HackerEarth, I was wondering what was going on. Thanks to I_love_Tanya_Romanova for clarifying this during the contest.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +9 Vote: I do not like it

    The DP in Bits Transformation is basically d[i][j][k] — the lowest cost of passing i characters in string a and filling first j zeros and first k ones in string b with them. Since k = i — j, we drop it and it becomes an O(n^2) d[i][j] dp.

    At each step we look at how much it would cost to transform symbol a[i] into zero and into one, and how far the next zero and one are in string b. Then we assign d[i + 1][j + 1] = min(d[i + 1][j + 1], d[i][j] + cost0 + <distance from i to j+1's zero in b> / 2) and the same for d[i + 1][j] <k + 1>. We divide the distance by 2, because for each element not in it's place we can swap it with another element not in place, which has to go in the other direction.

    Reference solution: http://pastebin.com/qJ1xhawP (a bit messy :P and I multiplied everything accept for distance coefficient by 2 to avoid fractions).

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Since you got best result for the challenge problem, do you mind sharing your approach? :)

    • »
      »
      »
      9 years ago, # ^ |
      Rev. 5   Vote: I like it +22 Vote: I do not like it

      I_love_Tanya_Romanova already described some approaches for the challenge problem, which lead to a very high score of 29224718. My score was slightly higher: 29315801.

      My solution consists of multiple greedy algorithms plus some randomization, all being run while time permits.

      The following greedy algorithms I ran only once:

      1) repeatedly choose the matrix which covers the largest sum (as long as it's positive) [my first submission consisted of just this and has a score of 96.17 of the final top score]

      2) repeatedly choose the matrix which has the highest sum/area ratio (as long as it's positive) [my 2nd submission contained just 1+2 and has a score of 96.44 of the final top score].

      3) skyline greedy: sort the matrices in decreasing order of area and try to place them from top to bottom so that as few empty cells are left (essentially trying to cover as much area as possible) — the good thing is that I mostly reused the code from a recent Codechef challenge problem for this part :) (CHPUZZLE from FEB'15 long contest)

      Then came the most important part of the algorithm, in my opinion. I sorted the matrices in descending order of area and then, while time permits:

      1) run a greedy which tries to place the next matrix in the first position it can, considering increasing rows — increasing columns order [my 3rd submission was just my 2nd submission + this greedy run only once (for the descending order of areas) : its score is 99.32 of the final top score and it would have been enough for winning the contest; if I had known, then maybe I would have participated in the Codechef cook-off instead of trying to optimize my solution further :) ]

      2) 3 more times same greedy, in order to cover all combinations increasing/decreasing row x increasing/decreasing column

      3) 10 times the same greedy, but for each piece the combination increasing/decreasing row x increasing/decreasing column was chosen randomly

      4) Change the order of the sorted matrices slightly : I randomly chose 10 pairs of matrices and swapped them in the sorted order, but I'm pretty sure something smarter could have been done here (like swap only "large" matrices between them or generate swaps only among the first x matrices in sorted order, where x would steadily increase after every iteration).

      5) If there is more time left, run the steps 1-5 again.

      So, as you can see, there was nothing fancy in my solution. In general I guess that there isn't time to try very complicated approaches in these 12 hour contests (compared to Codechef 10-day long contests or to Topcoder 1-2 weeks marathon matches, where there is time for testing more fancy approaches).

      As for the groups of tests, I am pretty sure that the highest scores were obtained for those tests which had all a(i,j)>=100 and the sum of areas of all the K matrices was larger than M*N. For these tests it made sense to try to cover as much of the large matrix as possible with the given K matrices, essentially ending up with a rectangle packing problem. I think a better scoring function would have been a relative scoring function, like: the sum covered by your placement of matrices / the sum of all the positive elements in the matrix. This way, the absolute score for each test would have been between 0 and 1 and it would have ensured that each test had an almost equal weight in the total absolute score. With the current scoring function, for instance, there were tests where the absolute score was around 1000 and tests where the absolute score was around 7.000.000. It's obvious that the score for the test with 1000 is almost irrelevant to the overall final score.

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Thanks for sharing your approach :)

        I agree that scoring function is far from being optimal and balanced. I wrote about similar problems with approximate task after February contest, and I kept that story in mind while we worked on this task — so at least number of tests was much higher this time:)

        I am not sure that at current moment HackerEarth allows using of scoring function you described here; however, I believe that your suggestions will be taken into account. Funny fact — while working on this problem, I was also curious how system will handle situations when high score for this problem is zero or negative :) And there actually were some submissions with negative score during contest — I think they have to be awarded with 0 instead of negative points.

        Don't underestimate role of other testcases. Mine solution (already visible in submissions list, later will also appear at editorial page) beats your 4-2 over cases 7-12 (those with large values in matrix), but at the same time loses more than 130 thousands points comparing to your at test #22. And 130k is more than difference between your and mine overal scores.

        P.S. MDCCXXIX said he think that issue with displaying wrong relative scores have been fixed now.

        • »
          »
          »
          »
          »
          9 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          The scoring function I suggested is pretty simple. Basically instead of considering the score S, it would be S / sum of all positive numbers in the matrix. But if the platform requires that the score is printed by the submission itself and is not computed by a separate judge program, then it could simply ask the submissions to also output this ratio.

          I could see your submission in the list, but I couldn't find out how to see the code/its results. If viewing it is really possible, then the interface is not friendly enough to make it easy/obvious.

          • »
            »
            »
            »
            »
            »
            9 years ago, # ^ |
              Vote: I like it +8 Vote: I do not like it

            Thanks for pointing out how to view submissions [ it turned out to be just a matter of scrolling down in some part of the page :) ]

            You're right about the tests and not underestimating the importance of smaller tests. Your solution did score better than mine on most of the large test cases (and also on many of the smaller ones), but overall my solution managed to get a slightly higher total score.

  • »
    »
    9 years ago, # ^ |
    Rev. 3   Vote: I like it +1 Vote: I do not like it

    Yes, problems aren't very difficult in this problemset. It was decided to make problemset harder than in February (and looks like laoriu succeeded with it), but there is no point in giving really complicated tasks, taking into account overall level of contestants and also the fact that approximate problem is presented.

    Also looking at standings now I think that one easier task was required, to make this contest more interesting for beginners :)

    HellKitsune already described an approach to solve Bits transformation with DP. Idea of both author's and tester's solution for this task is following: it always makes sense to do all substitutions first, and all swaps after that; it is well-known fact that number of swaps required to sort a permutation is equal to number of inversions in it (you can always pick pair of adjacent elements forming an inversion and swap them, and if there are no such pairs — permutation is sorted already). Key observation — when you build first string char by char, adding new char at given position increases final number of inversions by fixed value, depending only on position of this char in target string. Therefore you can write a DP minimal cost of placing first x chars while y of them being 1's, where cost is equal to sum of costs for all substitutions and cost of swaps required to solve all inversions you already got from this prefix.

    I am surprised that so small number of users submitted approximate problem. It sounds like an easy points for me — writing anything that makes sense will give you a lot of points, and for me it sounds easier to implement some naive idea for this task than to solve some of algorithmic ones.

    In this task simply taking input in given order and placing every matrix in leftmost topmost possible cell that gives you no overlaps and positive covered sum was already scoring 24.1 millions (so it will give you ~82 points already), changing order of traversal from row-major to column-major surprisingly increases result to 24.78 millions :) Sorting matrices in order of decreasing size is enough to make 27.5 millions out of it (almost 94 points).

    What was your approach? I didn't spend much time on this task, besides testing it) And I wasn't writing anything smart; only tried naive one: while we still have time, shuffle input and then try to put it in greedy way, taking matrices in this order. It gave me 29224718, I'll add my code to editorial later.

    And I would like to read approaches of other contestants also :)

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      I suspected that the problems weren't very difficult by design (I guess that if I had read some problems from the previous contests with the same format I could have come to this conclusion myself). I think that's fine, particularly given that there is also an approximation problem in the problem set.

      I am also surprised that so few people submitted anything for the challenge/approximation problem. Actually, do we know how many people actually participated in the contest? (i.e. made at least 1 submission). The standings seem to include also people which simply registered, but made no submission.

»
9 years ago, # |
  Vote: I like it +8 Vote: I do not like it

Can anyone explain the idea behind the problem counting on tree? It seems to be very hard to me.

  • »
    »
    9 years ago, # ^ |
    Rev. 2   Vote: I like it +1 Vote: I do not like it

    Simplest solution is one with DP on a tree. Let's count number of subtrees with root at given vertex and cost X. Answer for original problem will be equal to sum of numbers of such subtrees over all possible roots. How to calculate it? You have to pick root of a subtree and also some subgraphs from it's sons. But for every son part which you take is either empty graph or some subtree with root in this son. If you already solved problem for all sons — now you can write a knapsack-like DP to count number of ways to pick subtrees with given total cost. It gives you O(NK2) (which is enough to get AC, if implemented right) or O(NK), depending on implementation.

»
9 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

What was the official solution of Counting on Trees problem?

Looks like almost everyone coded O(NK2). I tought it is too much for 2 seconds time limit.

By the way kudos to Vuong Nguyen for his wonderful O(NKlogN) approach.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Before reading your comment I didn't even think that a complexity better than O(N*K^2) was possible. The limits seemed to suggest that a better time complexity isn't needed. I now looked at the author's solution and it's really cool how he computes the total number of required sets containing a fixed node (the current centroid) in O(csize*K) time (where csize is the size of the subtree of the current centroid). I wonder then why he didn't choose larger limits for K.

    As for other approaches for solving the problem better than O(N*K^2), it is at least theoretically possible to solve it in O(N*K*log(K)) using FFT. If a nicer number were used instead of 10^9+7, then FFT could have been a feasible solution. But under these conditions I think one would need to do FFT on real numbers, which could even exceed 64 bits, so it would be kind of painful (plus possibly too slow in practice).

  • »
    »
    9 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Original idea of laoriu was about centroid decomposition. At some moment there was an intention to pick larger constants (like 50000/1000 instead of 50000/100) to prevent naive solutions from passing, but then we decided to not do it in order to make contest simpler.

    As mugurelionut said, one can try FFT, but it is a bit fancy idea in this case :)

    O(NK2) isn't bad, it gives you around 2.5 * 108 operations. It isn't very much for 2 seconds. However, messy implementation with 5 * 108 operations should give you TL and something like 60 points.

    But you are wrong about almost everyone coded O(NK2). I checked AC solutions from a contest now, and there are lot of O(NK) solutions there. Mine solution from editorial page also runs in O(NK).

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +12 Vote: I do not like it

      Sorry, but your solution from the editorial is O(N*K^2): for each (node, child) pair (there are N-1 such pairs) you have two nested loops: for i, for j, each loop running over O(K) values. Why do you say that it is O(N*K)?

      I also looked at all the accepted solutions from the contest and all of them use essentially the same idea, which is O(N*K^2).

      • »
        »
        »
        »
        9 years ago, # ^ |
        Rev. 2   Vote: I like it +14 Vote: I do not like it

        Sorry, but my solution from editorial is O(NK). Of course, one can say that it is O(NK2) :) But better bound of O(NK) is also true. I thought it is well-known at least for experienced contestants, so I am surprised that you said that and also a bit surprised that you got so many upvotes for this claim already.

        Doing all loops up to min(K, S), where S is size of a subtree (and therefore merging two subtrees in O(min(S1, K) * min(S2, K))), makes O(NK) out of O(NK2). I've already seen it few times in Russian and Polish contests, where tasks like N=100000, K=1000 were given. It was also discussed at this CF blog entry (but it is in Russian). And here is also an explanation/proof from Burunduk1 provided in that topic (sorry, it is also in Russian, but at least all formulas should be clear).

        • »
          »
          »
          »
          »
          9 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Thanks for the explanation. Of course it's true that O(N*K) is also O(N*K^2) from a theoretical point of view :) But I actually thought that the better bound of O(N*K) didn't hold. I didn't realize that simply bounding the for loops to the sizes of the corresponding subtrees can have such a big impact.

          If K were larger, what I would have done would be to simply store the maximum value KMAX(x) for which a node x has a non-zero count (of course, KMAX(x) is upper bounded by K) and then only iterate up to it. Apparently doing this simple and obvious optimization is actually equivalent to iterating up to the sizes of the subtrees (since KMAX(x) cannot be larger than the size of node x's subtree). I think I probably did this optimization in several problems in the past without realizing that it might actually also improve the theoretical time complexity, not only the running time in practice.

          Another optimization which is obvious and very simple (and which I used in my solution) is: if the element in the outer for loop has a value of zero, do not enter the inner for loop. This basically ensures that the outer for loop iterates up to min(K, size of the subtree) (plus a few extra increments of the outer loop counter which do not matter much overall). Although the inner loop would still iterate up to K even when the size of the corresponding subtree is less than K, this actually improves the running time a lot. I submitted my solution both with and without this optimization for practice and the maximum running times are 0.25 sec (with) -vs- 1.25 sec (without). I wonder if the theoretical time complexity is better than O(N*K^2) in this case, too.

          • »
            »
            »
            »
            »
            »
            9 years ago, # ^ |
            Rev. 2   Vote: I like it 0 Vote: I do not like it

            Yes, it is a common practice to make some asymptotic optimizations without even realizing it:) At least for me :)

            It seems that your second optimization does not improve asymptotic. However, it is a good one when author does not want naive solution to pass, but isn't very good at test generation :) But here is a counterexample.

            Upd. And even simpler — take a tree with root and N-1 leaves.

»
9 years ago, # |
  Vote: I like it +5 Vote: I do not like it

Inconvinient time for EST, 24 hour format will be better.

»
9 years ago, # |
  Vote: I like it +9 Vote: I do not like it

The problem-set was really nice ! Congratz to to authors , i will look forward to participate to monthly clashes from now on !

»
9 years ago, # |
  Vote: I like it -8 Vote: I do not like it

when will the detailed editorials be added ?

»
9 years ago, # |
  Vote: I like it +1 Vote: I do not like it

The problems were interesting indeed, especially "Bits Transformation". Initially, I thought it is yet another max-flow problem, but then I came up with nice straightforward dp.

By the way, is it OK, that I haven't received any email about the prize yet? :)

(And I didn't find any place to write my address or something)

  • »
    »
    9 years ago, # ^ |
      Vote: I like it -8 Vote: I do not like it

    Hi Pavel,

    We send all the prizes for all contests in the end of the month so probably you will receive email for your prize in between 30th and 31st March :)