Блог пользователя Ant_Man

Автор Ant_Man, история, 3 года назад, По-английски

How to solve B, E, F, G, H, I, J?

  • Проголосовать: нравится
  • +54
  • Проголосовать: не нравится

»
3 года назад, # |
  Проголосовать: нравится +41 Проголосовать: не нравится

F: The idea comes from Petrozavodsk Summer 2016. Day 7: Ural FU Dandelion Contest. F. Suffix Array for Thue-Morse.

Build the compacted suffix automaton of $$$fib_n$$$ and you can find some nice structures.

  • Every fibonacci word ends with ab and ba alternatively.
  • The states corresponding with $$$fib_n, fib_{n-2}, \dots, fib_{n \bmod 2}$$$ are accepting states.
  • Except with the edge in the main string, there are $$$\min(n-2,0)$$$ extra edges.
  • Every extra edge starts from some position like $$$fib_i-2$$$ and ends at $$$fib_{i+1}-1$$$. The label in the edge is determined by the parity of $$$i$$$.
  • We will never go through two extra edges consecutively.

An example for $$$fib_6$$$

       _________a__________                _________________a_______________
      /                    \              /                                 \
0-a->[1]-b->[[2]]-a->[3]-a->4-b->[[5]]-a->6-b->7-a->[8]-a->9-b->10-a->11-a->12-b->[[13]]
\_____b_____/      \__________b_______________/

The [[x]] means the accepting states, and the [x] means a fibonacci word.

It can be seen that we can only use $$$O(n)$$$ states to represent $$$fib_n$$$. And the suffix array can be obtained by walking the compacted suffix automaton. Since the maximum index is $$$10^{18}$$$, we can only keep the last $$$O(\log 10^{18})$$$ nodes.

»
3 года назад, # |
  Проголосовать: нравится +21 Проголосовать: не нравится

E: It can be seen that $$$dp_x$$$ equals to some $$$a_i+b_{i+1}+\dots+b_{x}$$$. Let the prefix sum of $$$b$$$ be $$$s$$$,

$$$ dp_x=\max_{i=0}^{x}(a_i + s_x - s_i)=s_x+\max_{i=0}^{x}(a_i - s_i) $$$

Use segment tree to maintain the value of $$$a_i-s_i$$$.

»
3 года назад, # |
  Проголосовать: нравится -48 Проголосовать: не нравится

Meh, nine <=medium problems that can be done in <=2.5h and one boring killer.

  • »
    »
    3 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +101 Проголосовать: не нравится

    I agree but to be fair top-5 teams seems too strong now. Problems are either <=medium or almost unsolvable. It sounds stupid but... if it can be solved, USA1 will do it in 20 minutes :)

    I'm thinking about preparing contest for Ptz and OpenCup and I'm very scared that you guys (and USA1 and Past Glory and japan02) will solve everything (maybe but 1 problem) in 3 hours and I don't really know what to do about it. I mean, if I can solve it, you'll solve it faster and better.

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится +5 Проголосовать: не нравится

    A tiny idea: try to stay away from Opencup for more than 3 years and you might probably find difficulty increasing.

    Just kidding. Next time I will try to provide harder tasks for you. XD

»
3 года назад, # |
Rev. 2   Проголосовать: нравится +28 Проголосовать: не нравится

My solution for $$$H$$$ is supposed to be $$$nlogn$$$ but I have been getting TLE with it. Here's my code if someone could help me find out why I'm TLEing: https://ideone.com/AirWlG and I will explain my idea below.

Let's look at the $$$sum$$$ case first. Let $$$totalSum(m)$$$ be the sum of all hamming distances for $$$S^m$$$.

Since $$$S^m = S^{m - 1} + [m] + S^{m - 1}$$$, it's easy to see that $$$totalSum(m)$$$ can be defined recursively as $$$2 * totalSum(m - 1) + subSum(m)$$$. Where $$$subSum(m)$$$ is the sum of the hamming distances of the subarrays that contain $$$m$$$ in $$$S^m$$$ (I will explain the base case later).

Let's solve $$$subSum(m)$$$. Let $$$d$$$ be the smallest $$$d$$$ $$$s.t.$$$ $$$\lvert{S^d}\rvert >= n - 1$$$. The middle part of any $$$S^m$$$ $$$s.t.$$$ $$$ m > d$$$ will look something like this:

$$$[..., v_1, v_2, ..., v_{n - 1}, m, v_1, v_2, ... v_{n - 1}, ...]$$$

Key observation is that for any $$$m > d$$$, All $$$n$$$ subarrays we want to consider in the calculation of $$$subSum(m)$$$ will only differ in the element with value $$$m$$$. I will refer to these subarrays by their order from now on. (e.g. $$$[v_1, v_2, ..., m]$$$ is first subarray, $$$[v_1, v_2, ..., m, v_1]$$$ is the second, and $$$[m, v_1, v_2, ..., v_{n - 1}]$$$ is the $$$n^{th}$$$

Let's define $$$F[i]$$$ to be the hamming distance of the $$$i^{th}$$$ subarray $$$without$$$ the element $$$m$$$. Now $$$F$$$ is the same for all $$$m > d$$$.

Now $$$subSum(m)$$$ is $$$(\sum\limits_{i=1}^n F[i]) + n - freq[m]$$$. Where $$$freq[m]$$$ is the frequency of element $$$m$$$ in array $$$a$$$. (This is because element $$$m$$$ will be compared against event element in $$$a$$$).

We now need to compute 2 things:
(1) $$$F$$$
(2) $$$subSum(d)$$$ (To use this as a base case for our recursive definition above).

We will compute both of them in a similar manner.

Define $$$H(a, b, k)$$$ to be a function that returns a vector $$$ans$$$ of size $$$k$$$. $$$ans[i]$$$ is hamming distance between $$$a$$$ and the $$$i^{th}$$$ cyclical shift of $$$b$$$.

With some goofing around, you can see that you can use this function (or something similar) to get:
(1) $$$F$$$ = $$$H(a, [v_1, v_2, ..., v_{n - 1}], n)$$$.
(2) $$$subSum(d)$$$ = $$$H(a, S^d, length(S^d) - n + 1)$$$.
(I can explain them in another comment if someone wants to but I don't want to make this text any longer than it already is :c).

Key Observation to calculate $$$H$$$ is to see that the $$$b$$$ we send is a special array. It has at most $$$log_2(n)$$$ unique values, and the distance between two occurrences of any element of $$$x$$$ is $$$2^x$$$. You can combine these 2 observations to calculate $$$H$$$ in $$$log_2(n) * k$$$ (I believe its calculation is clear in the code but I could explain it in another comment too if someone wants).

The $$$minimum$$$ case is similar. Define $$$minHamming(m) = min(minHamming(m - 1), minSubHamming(m))$$$.

$$$minSubHamming(m)$$$ will either be $$$minF$$$ or $$$minF - 1$$$. Where $$$minF$$$ is the minimum value in $$$F$$$ calculated above. The case where it will be $$$minF - 1$$$ is when the element $$$m$$$ matches its counterpart in $$$a$$$. You can iterate over indices $$$i$$$ of element $$$m$$$ in $$$a$$$ and minimize with $$$F[n - i - 1]$$$ Since this is the only subarray that will make $$$m$$$ match in index $$$i$$$.

»
3 года назад, # |
  Проголосовать: нравится +11 Проголосовать: не нравится

Was solution for D just implication of Sylvester's determinant theorem, like $$$det(M)=x^{n}(1+\sum_{1}^{n}\frac{a_ib_i}{x})$$$? Are there some special cases?

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится +10 Проголосовать: не нравится

    There were no special cases

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится +20 Проголосовать: не нравится

    (for future reference)

    "Sylvester's determinant theorem" refers to this.

  • »
    »
    3 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +28 Проголосовать: не нравится

    I think this theorem is overkill here. You can subtract first row multiplied by $$$a_i/a_1$$$ from all other rows, then you get matrix with zeroes everywhere except first row, first column and diagonal and this case is obvious.The only special case is when $$$a_1 = 0$$$, then you need to swap it with some nonzero $$$a_i$$$.

»
3 года назад, # |
  Проголосовать: нравится +66 Проголосовать: не нравится

Several teams (including us) missed this case on H:

7 3
1 1 2 1 3 1 2

The answer is 6 6 (there is only one orientation allowed, and it has distance 6), but it's easy to print 1 6 if you accidentally allow the shift i=2 when computing the min. It's a one or two line fix but unfortunately wasn't in the test data.

»
3 года назад, # |
  Проголосовать: нравится +18 Проголосовать: не нравится

How to solve I.. ?

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится +15 Проголосовать: не нравится

    Do a straight-forward DP, like, f[i][last 3][second last 3]. Observe that generally f[i][][] and f[i + 1][][] differs by at most one row. Hence, it can be compute f[i + 1][][] from f[i][][] with O(n) efforts.

»
3 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

For J first you make an observation that if jailings intersect in interesting way then their sets of boundary cells intersect. You also note that number of boundary cells of jailing is at most linear in terms of cells with particular value corresponding to that jailing that lets you iterate through it.

First, for each cell you gather all values such that this cell lies on boundary of corresponding jailings. Then for each value you iterate through all its boundary cells and check it with all values gathered there. I guess it works in $$$O(area)$$$, but no idea why.

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится +10 Проголосовать: не нравится

    What about the test cases of the following format?

    zzzzzzz
    yyyyyyz
    xxxxxyz
    aaaaxyz
    bbbaxyz
    ccbaxyz
    dcbaxyz
    
    • »
      »
      »
      3 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      ¯\_(ツ)_/¯

      So it's $$$O((area)^{\frac32})$$$ there. Either there were no such cases or it works fast anyway since TL was quite generous.

    • »
      »
      »
      3 года назад, # ^ |
        Проголосовать: нравится +20 Проголосовать: не нравится

      I'm not sure about our proof, we had many handwave parts, but at least on this case in is linear if you distinguish between vertical and horizontal boundaries and compare only vertical with horizontal.

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится +18 Проголосовать: не нравится

    We made a few big insights:

    1. The total height of all jailings is at most $$$O(area)$$$, because a connected component must have at least as many cells as its jailing height. Likewise for width.
    2. The rectangular boundaries of any two jailings cannot intersect at 4 points; it must be exactly 0 or 2 points. This is because we're building the jailings from 4-connected components of cells. This gives a way to compute the answer by considering just the edges of the jailings (you need to choose tiebreaking appropriately, e.g. by using the area of the jailing).

    Just observation 2 gives a $$$O(area * log)$$$ solution using a plane sweep, but combining observation 1 lets us directly walk the boundaries for an $$$O(area)$$$ solution (you need to process rectangles in tiebreaking order).

    One other helpful conjecture is that the total number of intersecting pairs is at most $$$O(area)$$$; does anyone have a proof/counterexample?

    • »
      »
      »
      3 года назад, # ^ |
      Rev. 3   Проголосовать: нравится +10 Проголосовать: не нравится

      To be honest I don't really understand your idea. What is the value added by 2nd observation compared to what has been said before and how do you use it to get any solution? In a case where edges of two jailing coincide they intersect in many more than 2 points. In cases when they intersect in a corner only that is only 1 point. So it's not entirely true it is 0 or 2 point. And what tiebreaking are you talking about?

      (I edited my comment a bit)

      • »
        »
        »
        »
        3 года назад, # ^ |
        Rev. 3   Проголосовать: нравится +28 Проголосовать: не нравится

        The key to avoiding these edge cases is adding a tiebreaking order so that no two coordinates are exactly equal. First, sort all rectangles so that rectangle $$$i$$$ contains rectangle $$$j$$$ only if $$$i < j$$$, e.g. by area from biggest to smallest. Let $$$R_i = [xlo_i, xhi_i) \times [ylo_i, yhi_i)$$$. Then, instead consider the modified rectangle $$$R_i' = [xlo_i + i \varepsilon, xhi_i - i \varepsilon) \times [ylo_i + i \varepsilon, yhi_i - i\varepsilon)$$$. Because our ordering goes from big to small, you can check that this modification doesn't affect which rectangles contain which, so $$$f(R_i', R_j') = f(R_i, R_j)$$$. Then, all coordinates are distinct, and our observation that the borders of 2 modified rectangles intersect at exactly 0 or 2 points (in fact, exactly $$$2 f(R_i, R_j)$$$) holds.

        Once you have these, you can treat each (modified) rectangle as two horizontal and two vertical line segments. For each rectangle, to compute $$$s$$$, you find the sum of the indices of all other segments which intersect with these (with multiplicity) and divide by 2. This is a simple plane sweep. The first observation means that you can just walk the length of the segments instead, though I think you have to do O(original length), not O(number of post-tiebreak coords) (maybe someone can prove that the latter is small?).

»
3 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

How solve G without tl issues?

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    What's your complexity?

    • »
      »
      »
      3 года назад, # ^ |
      Rev. 3   Проголосовать: нравится 0 Проголосовать: не нравится

      $$$O((n+m)\log n\log 10^8)$$$ We got AC after contest, but it took too much time to optimize it.

      • »
        »
        »
        »
        3 года назад, # ^ |
          Проголосовать: нравится +20 Проголосовать: не нравится

        It is actually possible to do it in $$$O(n log n)$$$. Let P be our query point. First you determine how long horizontal and vertical rays starting from P we can draw. They can be determined using sweeping. They give you some upper bound on the answer. If answer is smaller it is because there is another point within rectangle formed by these rays. Depending on whether this obstacle point lies above or below diagonal ray starting from P, it gives upper bound corresponding to its x or y coordinate. You can determine these using sweeping as well. Solution by mnbvmar

        • »
          »
          »
          »
          »
          3 года назад, # ^ |
            Проголосовать: нравится +10 Проголосовать: не нравится

          You can do everything in one sweep (right to left), just keeping a regular segment tree for vertical segments and a set for horizontal segments. To get the answer, go down the tree and query the set. Code.

»
3 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

What was the correct idea for B?

  • »
    »
    3 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

    Binary search on answer, then go left to right and greedily assign each symbol to the currently shortest sequence that want it now. It is easy to show that it is always better to append 2 to empty sequence instead of [20] and it is always better to append 0 to [2] instead of [202].

    I don't think that it is always correct, but for 2020 it is.

    • »
      »
      »
      3 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Thanks, with binary search this kind of greedy becomes reasonable and easy to understand!

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Were there wrong ideas that passed ?

    • »
      »
      »
      3 года назад, # ^ |
        Проголосовать: нравится +1 Проголосовать: не нравится

      No, I just had some harder ideas, which seemed correct, but eventually appeared to be wrong.

»
3 года назад, # |
  Проголосовать: нравится -8 Проголосовать: не нравится

How to solve problem A (Arrange And Count) from div1 ? We thought about DP ,but we didn't manage to solve it.

  • »
    »
    3 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

    The operation is basically: reverse the sequence and rotate it. So after an even step you can get any cyclic shift of the original sequence and after an odd step you can get any cyclic shift of the reversed sequence. The only thing left is to count the number of distinct sequences among all these. Doable with suffix array.

  • »
    »
    3 года назад, # ^ |
    Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

    You need to find how many different sequences there are from all continuous subsequences of length $$$n$$$ from $$$a + a$$$ and $$$reverse(a) + reverse(a)$$$, where $$$+$$$ stand for concatenation. You can do it either with hashes, or with suffix array.

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    The answer is the size of the set containing all the rotation of the string, and all the rotation of the reversed string.

    Let $$$|P|$$$ be the shortest prefix such that $$$P \cdot k = S$$$ (here $$$P \cdot k$$$ is $$$P$$$ concatenated with itself $$$k$$$ times). The number of different rotation is equal to $$$|P|$$$. So the number of different rotations if you also take into account the reverse is $$$2 \cdot |P|$$$, but with the caveat that some strings might appear twice, once in the initial string and once in the reversed.

    If some rotation from the initial string appears as a rotation in the set of rotations of the reversed string, then both sets are the same.

    You only need to compute the length of $$$|P|$$$ and check whether $$$rev(S)$$$ appears as a substring in $$$SS$$$. $$$P$$$ can be computed in $$$O(n)$$$ using the prefix ($$$\pi$$$) function, and the latter can be checked also in $$$O(n)$$$ using KMP.

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится -20 Проголосовать: не нравится

    Thank you for detail explanations awoo , Ra16bit , mnaeraxr !

»
3 года назад, # |
  Проголосовать: нравится -32 Проголосовать: не нравится

How to solve problem C (Choose Two Subsequences) from div1 ? We thought about DP ,but we didn't manage to solve it. We got WA2. Um_nik , tourist , pashka, Ra16bit .

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    You can do $$$\mathcal{O}(|s|\cdot |t|)$$$ DP to solve the problem for each suffix of $$$s$$$ and each suffix of $$$t$$$.