When submitting a solution in C++, please select either C++14 (GCC 6-32) or C++17 (GCC 7-32) as your compiler. ×

Ant_Man's blog

By Ant_Man, history, 3 years ago, In English

How to solve B, E, F, G, H, I, J?

  • Vote: I like it
  • +54
  • Vote: I do not like it

»
3 years ago, # |
  Vote: I like it +41 Vote: I do not like it

F: The idea comes from Petrozavodsk Summer 2016. Day 7: Ural FU Dandelion Contest. F. Suffix Array for Thue-Morse.

Build the compacted suffix automaton of $$$fib_n$$$ and you can find some nice structures.

  • Every fibonacci word ends with ab and ba alternatively.
  • The states corresponding with $$$fib_n, fib_{n-2}, \dots, fib_{n \bmod 2}$$$ are accepting states.
  • Except with the edge in the main string, there are $$$\min(n-2,0)$$$ extra edges.
  • Every extra edge starts from some position like $$$fib_i-2$$$ and ends at $$$fib_{i+1}-1$$$. The label in the edge is determined by the parity of $$$i$$$.
  • We will never go through two extra edges consecutively.

An example for $$$fib_6$$$

       _________a__________                _________________a_______________
      /                    \              /                                 \
0-a->[1]-b->[[2]]-a->[3]-a->4-b->[[5]]-a->6-b->7-a->[8]-a->9-b->10-a->11-a->12-b->[[13]]
\_____b_____/      \__________b_______________/

The [[x]] means the accepting states, and the [x] means a fibonacci word.

It can be seen that we can only use $$$O(n)$$$ states to represent $$$fib_n$$$. And the suffix array can be obtained by walking the compacted suffix automaton. Since the maximum index is $$$10^{18}$$$, we can only keep the last $$$O(\log 10^{18})$$$ nodes.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Is the second extra b edge going from 3 to 7?

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +10 Vote: I do not like it

    Is there a general method for solving this problem on various morphic words? For instance, what about tribonacci: $$$a \to ab, b \to ac, c \to a$$$?

»
3 years ago, # |
  Vote: I like it +21 Vote: I do not like it

E: It can be seen that $$$dp_x$$$ equals to some $$$a_i+b_{i+1}+\dots+b_{x}$$$. Let the prefix sum of $$$b$$$ be $$$s$$$,

$$$ dp_x=\max_{i=0}^{x}(a_i + s_x - s_i)=s_x+\max_{i=0}^{x}(a_i - s_i) $$$

Use segment tree to maintain the value of $$$a_i-s_i$$$.

»
3 years ago, # |
  Vote: I like it -48 Vote: I do not like it

Meh, nine <=medium problems that can be done in <=2.5h and one boring killer.

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +101 Vote: I do not like it

    I agree but to be fair top-5 teams seems too strong now. Problems are either <=medium or almost unsolvable. It sounds stupid but... if it can be solved, USA1 will do it in 20 minutes :)

    I'm thinking about preparing contest for Ptz and OpenCup and I'm very scared that you guys (and USA1 and Past Glory and japan02) will solve everything (maybe but 1 problem) in 3 hours and I don't really know what to do about it. I mean, if I can solve it, you'll solve it faster and better.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    A tiny idea: try to stay away from Opencup for more than 3 years and you might probably find difficulty increasing.

    Just kidding. Next time I will try to provide harder tasks for you. XD

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it +2 Vote: I do not like it

      Another idea for me could be to get rid of carrying teammates xD

»
3 years ago, # |
Rev. 2   Vote: I like it +28 Vote: I do not like it

My solution for $$$H$$$ is supposed to be $$$nlogn$$$ but I have been getting TLE with it. Here's my code if someone could help me find out why I'm TLEing: https://ideone.com/AirWlG and I will explain my idea below.

Let's look at the $$$sum$$$ case first. Let $$$totalSum(m)$$$ be the sum of all hamming distances for $$$S^m$$$.

Since $$$S^m = S^{m - 1} + [m] + S^{m - 1}$$$, it's easy to see that $$$totalSum(m)$$$ can be defined recursively as $$$2 * totalSum(m - 1) + subSum(m)$$$. Where $$$subSum(m)$$$ is the sum of the hamming distances of the subarrays that contain $$$m$$$ in $$$S^m$$$ (I will explain the base case later).

Let's solve $$$subSum(m)$$$. Let $$$d$$$ be the smallest $$$d$$$ $$$s.t.$$$ $$$\lvert{S^d}\rvert >= n - 1$$$. The middle part of any $$$S^m$$$ $$$s.t.$$$ $$$ m > d$$$ will look something like this:

$$$[..., v_1, v_2, ..., v_{n - 1}, m, v_1, v_2, ... v_{n - 1}, ...]$$$

Key observation is that for any $$$m > d$$$, All $$$n$$$ subarrays we want to consider in the calculation of $$$subSum(m)$$$ will only differ in the element with value $$$m$$$. I will refer to these subarrays by their order from now on. (e.g. $$$[v_1, v_2, ..., m]$$$ is first subarray, $$$[v_1, v_2, ..., m, v_1]$$$ is the second, and $$$[m, v_1, v_2, ..., v_{n - 1}]$$$ is the $$$n^{th}$$$

Let's define $$$F[i]$$$ to be the hamming distance of the $$$i^{th}$$$ subarray $$$without$$$ the element $$$m$$$. Now $$$F$$$ is the same for all $$$m > d$$$.

Now $$$subSum(m)$$$ is $$$(\sum\limits_{i=1}^n F[i]) + n - freq[m]$$$. Where $$$freq[m]$$$ is the frequency of element $$$m$$$ in array $$$a$$$. (This is because element $$$m$$$ will be compared against event element in $$$a$$$).

We now need to compute 2 things:
(1) $$$F$$$
(2) $$$subSum(d)$$$ (To use this as a base case for our recursive definition above).

We will compute both of them in a similar manner.

Define $$$H(a, b, k)$$$ to be a function that returns a vector $$$ans$$$ of size $$$k$$$. $$$ans[i]$$$ is hamming distance between $$$a$$$ and the $$$i^{th}$$$ cyclical shift of $$$b$$$.

With some goofing around, you can see that you can use this function (or something similar) to get:
(1) $$$F$$$ = $$$H(a, [v_1, v_2, ..., v_{n - 1}], n)$$$.
(2) $$$subSum(d)$$$ = $$$H(a, S^d, length(S^d) - n + 1)$$$.
(I can explain them in another comment if someone wants to but I don't want to make this text any longer than it already is :c).

Key Observation to calculate $$$H$$$ is to see that the $$$b$$$ we send is a special array. It has at most $$$log_2(n)$$$ unique values, and the distance between two occurrences of any element of $$$x$$$ is $$$2^x$$$. You can combine these 2 observations to calculate $$$H$$$ in $$$log_2(n) * k$$$ (I believe its calculation is clear in the code but I could explain it in another comment too if someone wants).

The $$$minimum$$$ case is similar. Define $$$minHamming(m) = min(minHamming(m - 1), minSubHamming(m))$$$.

$$$minSubHamming(m)$$$ will either be $$$minF$$$ or $$$minF - 1$$$. Where $$$minF$$$ is the minimum value in $$$F$$$ calculated above. The case where it will be $$$minF - 1$$$ is when the element $$$m$$$ matches its counterpart in $$$a$$$. You can iterate over indices $$$i$$$ of element $$$m$$$ in $$$a$$$ and minimize with $$$F[n - i - 1]$$$ Since this is the only subarray that will make $$$m$$$ match in index $$$i$$$.

»
3 years ago, # |
  Vote: I like it +11 Vote: I do not like it

Was solution for D just implication of Sylvester's determinant theorem, like $$$det(M)=x^{n}(1+\sum_{1}^{n}\frac{a_ib_i}{x})$$$? Are there some special cases?

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +10 Vote: I do not like it

    There were no special cases

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +20 Vote: I do not like it

    (for future reference)

    "Sylvester's determinant theorem" refers to this.

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +28 Vote: I do not like it

    I think this theorem is overkill here. You can subtract first row multiplied by $$$a_i/a_1$$$ from all other rows, then you get matrix with zeroes everywhere except first row, first column and diagonal and this case is obvious.The only special case is when $$$a_1 = 0$$$, then you need to swap it with some nonzero $$$a_i$$$.

»
3 years ago, # |
  Vote: I like it +66 Vote: I do not like it

Several teams (including us) missed this case on H:

7 3
1 1 2 1 3 1 2

The answer is 6 6 (there is only one orientation allowed, and it has distance 6), but it's easy to print 1 6 if you accidentally allow the shift i=2 when computing the min. It's a one or two line fix but unfortunately wasn't in the test data.

»
3 years ago, # |
  Vote: I like it +18 Vote: I do not like it

How to solve I.. ?

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +15 Vote: I do not like it

    Do a straight-forward DP, like, f[i][last 3][second last 3]. Observe that generally f[i][][] and f[i + 1][][] differs by at most one row. Hence, it can be compute f[i + 1][][] from f[i][][] with O(n) efforts.

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

For J first you make an observation that if jailings intersect in interesting way then their sets of boundary cells intersect. You also note that number of boundary cells of jailing is at most linear in terms of cells with particular value corresponding to that jailing that lets you iterate through it.

First, for each cell you gather all values such that this cell lies on boundary of corresponding jailings. Then for each value you iterate through all its boundary cells and check it with all values gathered there. I guess it works in $$$O(area)$$$, but no idea why.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +10 Vote: I do not like it

    What about the test cases of the following format?

    zzzzzzz
    yyyyyyz
    xxxxxyz
    aaaaxyz
    bbbaxyz
    ccbaxyz
    dcbaxyz
    
    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      ¯\_(ツ)_/¯

      So it's $$$O((area)^{\frac32})$$$ there. Either there were no such cases or it works fast anyway since TL was quite generous.

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it +20 Vote: I do not like it

      I'm not sure about our proof, we had many handwave parts, but at least on this case in is linear if you distinguish between vertical and horizontal boundaries and compare only vertical with horizontal.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +18 Vote: I do not like it

    We made a few big insights:

    1. The total height of all jailings is at most $$$O(area)$$$, because a connected component must have at least as many cells as its jailing height. Likewise for width.
    2. The rectangular boundaries of any two jailings cannot intersect at 4 points; it must be exactly 0 or 2 points. This is because we're building the jailings from 4-connected components of cells. This gives a way to compute the answer by considering just the edges of the jailings (you need to choose tiebreaking appropriately, e.g. by using the area of the jailing).

    Just observation 2 gives a $$$O(area * log)$$$ solution using a plane sweep, but combining observation 1 lets us directly walk the boundaries for an $$$O(area)$$$ solution (you need to process rectangles in tiebreaking order).

    One other helpful conjecture is that the total number of intersecting pairs is at most $$$O(area)$$$; does anyone have a proof/counterexample?

    • »
      »
      »
      3 years ago, # ^ |
      Rev. 3   Vote: I like it +10 Vote: I do not like it

      To be honest I don't really understand your idea. What is the value added by 2nd observation compared to what has been said before and how do you use it to get any solution? In a case where edges of two jailing coincide they intersect in many more than 2 points. In cases when they intersect in a corner only that is only 1 point. So it's not entirely true it is 0 or 2 point. And what tiebreaking are you talking about?

      (I edited my comment a bit)

      • »
        »
        »
        »
        3 years ago, # ^ |
        Rev. 3   Vote: I like it +28 Vote: I do not like it

        The key to avoiding these edge cases is adding a tiebreaking order so that no two coordinates are exactly equal. First, sort all rectangles so that rectangle $$$i$$$ contains rectangle $$$j$$$ only if $$$i < j$$$, e.g. by area from biggest to smallest. Let $$$R_i = [xlo_i, xhi_i) \times [ylo_i, yhi_i)$$$. Then, instead consider the modified rectangle $$$R_i' = [xlo_i + i \varepsilon, xhi_i - i \varepsilon) \times [ylo_i + i \varepsilon, yhi_i - i\varepsilon)$$$. Because our ordering goes from big to small, you can check that this modification doesn't affect which rectangles contain which, so $$$f(R_i', R_j') = f(R_i, R_j)$$$. Then, all coordinates are distinct, and our observation that the borders of 2 modified rectangles intersect at exactly 0 or 2 points (in fact, exactly $$$2 f(R_i, R_j)$$$) holds.

        Once you have these, you can treat each (modified) rectangle as two horizontal and two vertical line segments. For each rectangle, to compute $$$s$$$, you find the sum of the indices of all other segments which intersect with these (with multiplicity) and divide by 2. This is a simple plane sweep. The first observation means that you can just walk the length of the segments instead, though I think you have to do O(original length), not O(number of post-tiebreak coords) (maybe someone can prove that the latter is small?).

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How solve G without tl issues?

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    What's your complexity?

    • »
      »
      »
      3 years ago, # ^ |
      Rev. 3   Vote: I like it 0 Vote: I do not like it

      $$$O((n+m)\log n\log 10^8)$$$ We got AC after contest, but it took too much time to optimize it.

      • »
        »
        »
        »
        3 years ago, # ^ |
          Vote: I like it +20 Vote: I do not like it

        It is actually possible to do it in $$$O(n log n)$$$. Let P be our query point. First you determine how long horizontal and vertical rays starting from P we can draw. They can be determined using sweeping. They give you some upper bound on the answer. If answer is smaller it is because there is another point within rectangle formed by these rays. Depending on whether this obstacle point lies above or below diagonal ray starting from P, it gives upper bound corresponding to its x or y coordinate. You can determine these using sweeping as well. Solution by mnbvmar

        • »
          »
          »
          »
          »
          3 years ago, # ^ |
            Vote: I like it +10 Vote: I do not like it

          You can do everything in one sweep (right to left), just keeping a regular segment tree for vertical segments and a set for horizontal segments. To get the answer, go down the tree and query the set. Code.

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

What was the correct idea for B?

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +10 Vote: I do not like it

    Binary search on answer, then go left to right and greedily assign each symbol to the currently shortest sequence that want it now. It is easy to show that it is always better to append 2 to empty sequence instead of [20] and it is always better to append 0 to [2] instead of [202].

    I don't think that it is always correct, but for 2020 it is.

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Thanks, with binary search this kind of greedy becomes reasonable and easy to understand!

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Were there wrong ideas that passed ?

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it +1 Vote: I do not like it

      No, I just had some harder ideas, which seemed correct, but eventually appeared to be wrong.

»
3 years ago, # |
  Vote: I like it -8 Vote: I do not like it

How to solve problem A (Arrange And Count) from div1 ? We thought about DP ,but we didn't manage to solve it.

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +10 Vote: I do not like it

    The operation is basically: reverse the sequence and rotate it. So after an even step you can get any cyclic shift of the original sequence and after an odd step you can get any cyclic shift of the reversed sequence. The only thing left is to count the number of distinct sequences among all these. Doable with suffix array.

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    You need to find how many different sequences there are from all continuous subsequences of length $$$n$$$ from $$$a + a$$$ and $$$reverse(a) + reverse(a)$$$, where $$$+$$$ stand for concatenation. You can do it either with hashes, or with suffix array.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    The answer is the size of the set containing all the rotation of the string, and all the rotation of the reversed string.

    Let $$$|P|$$$ be the shortest prefix such that $$$P \cdot k = S$$$ (here $$$P \cdot k$$$ is $$$P$$$ concatenated with itself $$$k$$$ times). The number of different rotation is equal to $$$|P|$$$. So the number of different rotations if you also take into account the reverse is $$$2 \cdot |P|$$$, but with the caveat that some strings might appear twice, once in the initial string and once in the reversed.

    If some rotation from the initial string appears as a rotation in the set of rotations of the reversed string, then both sets are the same.

    You only need to compute the length of $$$|P|$$$ and check whether $$$rev(S)$$$ appears as a substring in $$$SS$$$. $$$P$$$ can be computed in $$$O(n)$$$ using the prefix ($$$\pi$$$) function, and the latter can be checked also in $$$O(n)$$$ using KMP.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it -20 Vote: I do not like it

    Thank you for detail explanations awoo , Ra16bit , mnaeraxr !

»
3 years ago, # |
  Vote: I like it -32 Vote: I do not like it

How to solve problem C (Choose Two Subsequences) from div1 ? We thought about DP ,but we didn't manage to solve it. We got WA2. Um_nik , tourist , pashka, Ra16bit .

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    You can do $$$\mathcal{O}(|s|\cdot |t|)$$$ DP to solve the problem for each suffix of $$$s$$$ and each suffix of $$$t$$$.

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Thank you Benq Benjamin for the solution .I got it . We wrote dp on prefixes . it was our mistake .