chenjb's blog

By chenjb, history, 4 years ago, In English

Hello! I'm happy to announce XX Open Cup: Grand Prix of Nanjing, which will be held on May 3rd, 2020.

This contest is mainly prepared by gisp_zjz and me, which has already been used in Moscow Pre-Finals Workshop 2020 Day 5 on April 29th, 2020.

As you can see on the top of the statement, this contest is also been called Legilimens+Coffee Chicken Contest. Team Coffee Chicken from Nanjing University will participate in World Finals 2020 and we wish them best luck in Moscow. Team Legilimens is from Zhejiang University, which is my team in the past 3 years. Although we didn't get advanced to World Finals 2020 and this is the end of our journey, we have experienced a lot of happy and unforgettable moments. Therefore, please allow me to express my gratefulness to my teammates Subconscious and oipotato. And may the friendship between Nanjing University and Zhejiang University last forever.

Authors of this contest come from Competitive Programming Team members from both Zhejiang University and Nanjing University, including gisp_zjz, AceSrc, zyb, Roundgod, calabash_boy, sy_chen, chenjb, Subconscious, oipotato, Sugar_fan, etc.

Testers: mayaohua2003, gtrhetr, xiaowuc1. Thank you!

Contest link: http://official.contest.yandex.ru/opencupXX/contest/18242/enter (Only visible for users with OpenCup login)

I will post the editorial after the contest. We are looking forward to your participation!

UPD: Now it is over. Please feel free to discuss solutions and this is the Editorial Sketch.

UPD2: Now it is in the gym.

  • Vote: I like it
  • +124
  • Vote: I do not like it

| Write comment?
»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by chenjb (previous revision, new revision, compare).

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by chenjb (previous revision, new revision, compare).

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by chenjb (previous revision, new revision, compare).

»
4 years ago, # |
  Vote: I like it +18 Vote: I do not like it
  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Um_nik told me this bubble cup problem after the pre-final contest. I camp up with this problem from a problem called Constellation in JOI2011-2012 around 2015-16 when I am in high school. So SAD...

»
4 years ago, # |
  Vote: I like it +37 Vote: I do not like it

Where is the editorial?

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How to solve B,F,G, K?

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +20 Vote: I do not like it

    B: lengths of borders can be grouped in $$$O(\log(n))$$$ arithmetic progressions. When you add a new character, go through them from longest to shortest. If you need to delete element, do it naively, if you don't delete the first 2 elements in arithmetic progression, you don't need to delete any elements in it, so skip it.

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 2   Vote: I like it +66 Vote: I do not like it

      Simpler solution for B: if a certain length stops being a period, it will never become one again. So we can do independent binary searches for each length, and use hashes to compare.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +10 Vote: I do not like it

        Well, I guess it depends on one's taste. For me arithmetic progressions idea seems very natural and the code is pretty simple, while anything involving hashes and binary search makes me vomit.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it +31 Vote: I do not like it

          Indeed. I don't understand strings well enough to see that lengths of borders have that structure :)

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Can you share the code?

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Actually, we know that for a repetition of length K is existing in a continuous period [L,R]. L is obviously equal to K. For R, we can binary search it and check with Hash or Suffix Array. Precalculating it for repetitions of all length separately.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +10 Vote: I do not like it

      Well, actually you can observe that if some string has some border, then the string shorter by one has a border shorter by one. So, for each string you can look at its border of length $$$1$$$ and check for how long will it be the border of the next strings. So just do binary search and compare two substrings with hashes or something.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Solution to G was discussed in 300iq's chat a few days ago.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +16 Vote: I do not like it

      So, after the onsite? Interesting

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +16 Vote: I do not like it

        No, it was on April 28th, one day before it, lol.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          I am in that chat too and it indeed made me a little awkward since I don't know whether they are discussing because of my problem before the opencup...Maybe just a coincidence.

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    G: n^4 is a simple knapsack, right? In order to make it faster I do "knapsack without one element" in D&C manner in O(n^3 log n).

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I'm surprised it passes. I meant the other solution, which subtracts dp values to get knapsack without one element in $$$O(n^3)$$$.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +26 Vote: I do not like it

        I'm not in the elite chat, but the problem and the trick are very similar to the TopCoder one from this post.

      • »
        »
        »
        »
        4 years ago, # ^ |
        Rev. 2   Vote: I like it +30 Vote: I do not like it

        That solution is the reason we didn't won NEERC 2019, so I stared at the problem for half an hour and then wrote $$$O(n^3 \log n)$$$ Swistakk described because it only uses additions. Can you prove that your solution is numerically stable?

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          I feel like "Can you prove that your solution is numerically stable?" is a rhetoric question that will never get answered.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Obviously I can't prove that since it doesn't work on tests from that NERC problem. :)

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            Judging by the editorial, authors also didn't think about precision issues.

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              Quoted from the author: 'About the precision issue when subtracting, we ran stress test with O(n^4) brute-force and it seems no effect. Meanwhile, because we are calculating the possibility instead of the number of ways, so undoing one item won’t affect much. But if we store the number of ways, we may need long double.'

              Well, it is still possible that we didn't generate bad enough tests...

              • »
                »
                »
                »
                »
                »
                »
                »
                4 years ago, # ^ |
                  Vote: I like it +18 Vote: I do not like it

                This generator breaks my solution with subtractions consistently.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                  Rev. 7   Vote: I like it +13 Vote: I do not like it

                  (Modified a bit since the test at the beginning was problematic...)

                  Run a test with your generator on USA1's, STD and Polish Mafia's and yours...

                  The result is that only Polish Mafia's solution survived...

                  I have to say that I still kind of agree that if we calculate probability instead of number of ways, the precision shouldn't be an issue. But maybe this test proved it wrong.

                  So, the official solution should be that Divide and Conquer solution with the time complexity of O(n^3logn). Glad that the TL allows it to get passed.

                  We are still trying to fix the std, but anyway I think either we should avoid any issue with subtraction or we shall make it a problem with modulo next time...

                  Thank you very much!

                  UPD: Some solutions with subtraction survived. Investigating.

                  UPD2: aid hacked them too. You are so cool! btw, ksun48 provides a tricky method: Calculating the probability of losing as well as winning and pick the reasonable one. Very funny but maybe also very reasonable~

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +18 Vote: I do not like it

                  But how do you know which one is more reasonable? :P

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +10 Vote: I do not like it

                  The one that lies in range $$$[0, 1]$$$ is obviously more reasonable. But it's possible to hack both solutions with the same test.

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 2   Vote: I like it +30 Vote: I do not like it

      There is another $$$\mathcal{O}(n^3)$$$ solution. I_love_Tanya_Romanova came up with it during the contest.

      We can do two separate DPs.

      • $$$dp_0(pos, cnt, s_{got})$$$ is the number of ways to choose $$$cnt$$$ elements from $$$x_0...x_{pos-1}$$$ with sum equal to $$$s_{got}$$$

      • $$$dp_1(pos, cnt, s_{remains})$$$ is the number of ways to choose $$$cnt$$$ elements and one more special (last) element from $$$x_0...x_{pos-1}$$$, in such a way, that $$$s_{remains} +$$$ $$$\mathit{the\ sum\ of\ chosen\ elements}$$$ is good

      There are simple transitions inside of each DP (constant number of transitions per state). And from each state in $$$dp_0$$$ there should be a transition to an interval in $$$dp_1$$$:

      $$$dp_0(pos, cnt, s_{got}) \rightarrow dp_1(pos + 1, cnt, (a-s_{got}-x_{pos}, b-s_{got}-x_{pos}])$$$

      We can avoid subtractions of real values by using the Queue to add on interval (queue implemented on two stacks).

      code: 79081723

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    K: for requests with x>sqrt(n), iterate over depths d(a)+y, d(a)+y+x, ... and maintain a separate data structure for each depth (vertices go in the dfs order). For requests with x<sqrt(n), maintain a separate data structure for each (x,y) pair (vertices go in the dfs order). In both branches, we need an "imbalanced" data structure which has O(n) updates and O(n*sqrt(n)) queries, or vice versa. This can be done by splitting into sqrt(n) blocks (instead of using a Fenwick tree which we got TLE with), so that the less frequent operation is done in O(sqrt(n)), and the more frequent in O(1).

    EDIT. And to avoid getting MLE, for requests with x<sqrt(n), we iterate over x on the outside so that we only O(n) memory.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Are you saying that you had two separate implementations of (update, query), one in (O(sqrt n), O(1)) and the other in (O(1), O(sqrt n)) ?

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        We didn't, we tried to squeeze Fenwick and couldn't. But I think the intended solution does.

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Anything faster than $$$O(n \sqrt{n})$$$ in K? We've squeezed $$$O(n \sqrt{n} \log(n))$$$, I see that it can be implemented without $$$\log$$$ but it starts looking bad.

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    On the workshop we didn't have any troubles with $$$O(n\sqrt{n} log(n))$$$ solution, it passes in half of TL. Code.

    UPD: I just checked in opencup's Y.C, it is still half TL.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I had TL with almost exactly the same code on Open Cup :(

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Well, I guess it's not exactly the same. Let me try to understand the difference :) Our code that gets TL.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          So my solution passes after borrowing just one trick from your solution: don't do binary search in the branch at line 155 (in your solution), instead precomputing the "pos" array: https://pastebin.com/Tw6BRMWq. Wow.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it +20 Vote: I do not like it

            And 5 more seconds are saved by this line:

            if (rr == ll) break;
            

            After those two fixes my solution also runs in half TL.

            I guess even with C++ I need to learn to squeeze better :)

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    O(n^1.5) is the intended solution just as Petr said. We set the TL to make it difficult to let O(n^1.5logn) pass while O(n^1.5) won’t need much adjustment. I didn’t notice KAN pass it with a O(n^1.5logn) and the TL hasn’t changed in the OpenCup as far as I know.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    I sent a solution with centroid decomposition, with less effort to model. blog

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by chenjb (previous revision, new revision, compare).

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How to solve Problem O(Official Visit) ? How we can construct path ,so it will be maximum?

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

What is i in Sx(i) in the editorial of problem D?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I think it refers to the start index of Sx.

»
4 years ago, # |
Rev. 2   Vote: I like it +10 Vote: I do not like it

Was solution to M inspired by "divide and color" approach to finding k-path in $$$O^*(4^k)$$$? Solution in current state is kinda disappointing since if you didn't read, it works for k up to 10 only, maybe you can make it work for bigger values, but that would get messier since it basically handcrafts a different algorithm for every value. However if you apply that recursively (details to be figured out by careful reader) you will get $$$O(4^k poly(n, m))$$$ algorithm that is satisfying from theoretical point of view.

We got good ol' color coding in $$$O^*((2e)^k)$$$ and it really barely doesn't pass (we managed to get to test 47)... Seems constraints were tailored exactly for this particular solution to be the best one. If only TL would have been a little bigger or it would have been path instead of a cycle or if constraints would have been a bit different... And, what is more, I thought about divide and color approach during the contest as natural next step from optimizing simple color coding, but I never really made an effort to understand it and expected it to be too complicated to be intended solution, so I didn't bother to check it during contest (and indeed it was too complicated, but for k<=10 it simplifies as in editorial).

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +30 Vote: I do not like it

    Yes, you've got that right.

    I'm one of the coauthors of Problem M. I'd like to share some stories behind the scene.

    When we made up this problem set, we wanted to add a randomized problem simply cuz random & approx algorithms are the main research fields our TCS group. Then we just come up with color coding, which is not yet widely known in competitive programming community.

    Before this problem was made, there was a problem another contest (2019 China Multi-University Training Contest) asking to find k-path. In that problem, a simple $$$O^*((2e)^k)$$$ color coding could pass, but during that contest many participants passed it with various heuristics. So we just wanted to make another color coding problem, which couldn't be easily passed by heuristics. This is M in this contest.

    The basic idea of M indeed came from the $$$O^*(4^k)$$$ algorithm for finding a k-path by combining color coding and divide-and-conquer. The constraint $$$k \leq 10$$$ is mainly because of two reasons: one is we can just split to two 5-paths and perform a case analysis, thus simplifying the solution; the other is that it provides enough budget for the constraints on the # of vertices and edges.

    To avoid heuristic solutions, we made it to find k-cycle and carefully tuned the constraints. We successfully hacked many common heuristics (among which the most difficult one to hack is to precompute a "relaxed" answer which allows cycles with repeated vertices and use this information to prune during exhaustive search). Actually there is still one solution we haven't hacked: compute biconnected components and use the aforementioned heuristic on each component. But we thought nobody would come up with such solution during contest. :P

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +20 Vote: I do not like it

    Yeah, we were able to pass M with color-coding with a couple constant-factor optimizations:

    1. When searching for P points it's actually ideal to assign P+1 or P+2 colors rather than P -- traveling salesman is 2x slower but probability of success is quite a bit higher.
    2. Instead of initializing with a vertex and searching for the remaining K-1 points, you can initialize with an edge and search for the remaining K-2 points. Since M and N have the same bound each edge is just as likely to be part of a correct set.
    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Second idea (with starting from an edge) seemed to me like a great one, but unfortunately it has not made our algorithm better. Even though we need to use one color less, it is outweighed by the fact that if we delete a vertex we gain much more than if we delete an edge (we considered starting vertices in descending order of their degrees and removed them after that) which is especially painful on a clique on 25 vertices which I think was already a worst-case for our algorithm.

      However the optimization with 2 more colors seemed to do the trick and we managed to get it accepted with it. I think that at some point in my life I wondered what happens if we allow more colors in this algorithm, but did this with focus on complexity rather than constant optimization and got no results and forgot about it.

»
4 years ago, # |
  Vote: I like it -75 Vote: I do not like it

Why are you discussing a private contest in a public place? I hate this. I think it's unfair for users who don't have an OpenCup login.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +50 Vote: I do not like it

    It is not good that it is hard to get OpenCup login. But your logic is weird. Many people participate in OpenCup, so this blog definetely has an audience on CF. Moreover, CF is a go-to place for any cp related discussions. There are a lot of local contests (for example ICPC regionals) which are also not open for everyone and are discussed on CF. Is that bad too?

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 3   Vote: I like it -30 Vote: I do not like it

      In most cases, the ICPC contests will be published anyway(virtual contest, or on Gym, or on some official websites), but the contests for OpenCup won't(some will be on Gym, but just depend on writers). That's the difference. I think the discussion for contests(shared with everyone) is good, but for some contests(people without login have no access to the resources) is really annoying.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +41 Vote: I do not like it

        Sorry, but we discuss the problems not because we want to share knowledge with you, but because we are interested in different approaches.

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Any better explanation for C?

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by chenjb (previous revision, new revision, compare).