SecondThread's blog

By SecondThread, history, 20 months ago, In English

Meta Hacker Cup Qualification Round Editorial

Good morning!

As usual, written editorials are available in the solution section of each problem on the Hacker Cup site. I thought this post might also be useful for seeing what people thought of the problems.

A: Second Hands

Feedback

B1/B2: Second Friend

Feedback

C1/C2: Second Meaning

Feedback

D: Second Flight

Feedback

So, what'd you think of the problems?

  • +225

»
20 months ago, # | +73

Thanks for the contest! I posted my video of winning 1st place along with solution explanations here: https://www.youtube.com/watch?v=wh3pz18qC70

»
20 months ago, # | -27

Did anyone else find the C1 statement complicated, or was it just me?

»
20 months ago, # | 0

My two solutions, A and B1, got accepted. Did I qualify for the next round?

»
Rev. 2, 20 months ago, # | 0

What's up with the afro, palette, and weird voice in second friend?

»
20 months ago, # | -13

The checker evaluated my B2 solution on the wrong input file... I checked the "differing" outputs, and they were of totally different dimensions. I tried to submit in practice mode and got a different input file; I was again unable to upload, but my friend's AC code generated exactly the same output as mine. Please make sure this does not happen in future rounds.

»
20 months ago, # | +23

Why can't Meta simply take the source code and judge it online? For many, downloading and uploading txt files is too cumbersome, and I don't see any point in it. They seem to do it just to keep the format distinct from other contests.

»
20 months ago, # | -16

Do we need to manually register for Round 1? Btw, problems D and B2 were nice.

  • »
    »
    20 months ago, # ^ | 0

    You'll be registered for round 1 after we complete our plagiarism checks on all submissions.

»
20 months ago, # | +24

Will the large file submission error issue be fixed before the next round?

If someone's code did not finish within 6 minutes locally but is fast enough on the MHC judge, they may be able to get an AC they couldn't originally get by sending it in a clar.

On the other hand, what if someone falsely claims that their O(N^2) solution ran fast locally even with N = 200000? Could they get an unfair AC too?

I am concerned that this could happen, as I think it would be a bit unfair.

  • »
    »
    20 months ago, # ^ | 0

    If code you send in a clar doesn't run in under 6 minutes for them, you don't get an AC. (Checked in practice mode, though the person who sent that code had it running much faster than 6 minutes locally.)

    • »
      »
      »
      20 months ago, # ^ | 0

      Thanks.

      On the other hand, a person was able to get an AC by sending code in a clar, because it was fast enough on the MHC judge even though it took longer than 6 minutes to run locally.

      Umm...

      • »
        »
        »
        »
        20 months ago, # ^ | +8

        Well, it seems the MHC judge is slower than the average PC, so maybe that isn't so unfair...

  • »
    »
    20 months ago, # ^ | +18

    Will the large file submission error issue be fixed before the next round?

    Yes. If we don't find a technical solution, we will certainly adjust the problems so that the output in all future rounds is much smaller.

    On the other hand, what if someone falsely claims that their O(N^2) solution ran fast locally even with N = 200000? Could they get an unfair AC too?

    Well, if it isn't truly fast, they won't have the output 6 minutes later, so they won't be able to upload the solution in time. For very important decisions, like who makes it to the finals, we'll also manually check people's code to make sure that everything was done legitimately.

    • »
      »
      »
      20 months ago, # ^ | 0

      I got it. Thank you. I'm looking forward to the next round!

    • »
      »
      »
      20 months ago, # ^ | +8

      Are input files going to be smaller too?

      After 5 failed attempts to download the input data for task D, I managed to get it, but my computer completely crashed while unzipping the file. I haven't seen anyone else with the same issue, so I don't know whether the file was corrupted by a bad connection, whether my 4-5-year-old laptop can't handle files that large, or whether it's something fixable.

      • »
        »
        »
        »
        20 months ago, # ^ | +8

        This makes me a bit curious. What laptop (brand, model) do you have?

        • »
          »
          »
          »
          »
          20 months ago, # ^ | 0

          It's a Dell Vostro 3578. It should absolutely be able to unzip that file, and it has managed to unzip much larger files before, so I have no clue why it crashed now.

      • »
        »
        »
        »
        Rev. 2, 20 months ago, # ^ | +6

        We hadn't heard many reports [any other than this one, that I remember] of input files being too large and causing issues, so I wasn't planning to worry much about input file sizes. We likely could make them smaller, but the issue with doing so is that it would require you to implement generators for the input, which are generally unpopular because they're annoying and make the statement longer.

»
20 months ago, # | -9

Why tf did I not get a notification from FB about the contest starting? I registered and then forgot about the dates.

  • »
    »
    20 months ago, # ^ | +11

    +1 to that. I haven't received any email, even though I have such emails enabled on my profile.

»
20 months ago, # | +30

Hi,

Out of curiosity, how does the checker work for Problem C?

Thanks!

»
20 months ago, # | +13

Due to a mistake on my part, I didn't check for duplicate queries in D. (used a and b for queries and thought that the last restriction gave me a free pass)

I wanted to ask why you designed the problem this way: caching has nothing to do with the actual solution, and testing for it just feels cheap, even if I get that dupes may happen in a real-world scenario.

  • »
    »
    Rev. 2, 20 months ago, # ^ | +86

    In my opinion, realizing that caching the answer can change the asymptotic complexity from $$$O(QN)$$$ to something better (in my case I think it was $$$O(Q\sqrt N)$$$ or so...) was an important part of the task.

    • »
      »
      »
      20 months ago, # ^ | +6

      I understand what you're saying, but I would argue that making the other observation (i.e. rearranging $$$a$$$, $$$b$$$ in each query so that $$$deg(a) \leq deg(b)$$$) is much more important for improving the complexity, while caching alone won't bring a solution (at least mine) below $$$O(QM)$$$.

      • »
        »
        »
        »
        Rev. 6, 20 months ago, # ^ | 0

        I don't understand. You claim caching will not improve your complexity from $$$O(QM)$$$. So you found another solution which didn't require caching to pass? That's great! What are you complaining about? Edit: Ah, I think you wanted to point out that caching alone won't solve the problem? Well, yes, it won't; both observations are needed to solve it. I do think it is really unfortunate, though, that you misread the problem statement with regard to duplicated queries!

        In my solution, both caching and ensuring that $$$deg(a) \le deg(b)$$$ were equally important observations. Without either of them, the complexity was too high. (I even tested this after the contest, because I too was surprised by the simplicity of the solution: removing either of them, my program just would not finish.) My solution. The second editorial solution even lists those two optimizations.

        For me, calculating the complexity before submission was a bigger part of the problem than implementing it.
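        For concreteness, here is a minimal sketch of the two observations discussed above (caching plus iterating over the lower-degree endpoint). It assumes, hypothetically, that each query $$$(a, b)$$$ asks for the number of common neighbors of $$$a$$$ and $$$b$$$; the real problem D aggregates more information per two-step path, but the complexity argument is the same.

```python
from collections import defaultdict

# Hypothetical simplification of problem D: each query (a, b) asks for the
# number of common neighbors of a and b. The real problem aggregates more
# data per two-step path, but the complexity argument is identical.
def solve(edges, queries):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    cache = {}
    answers = []
    for a, b in queries:
        # Observation 1: always iterate over the lower-degree endpoint.
        if len(adj[a]) > len(adj[b]):
            a, b = b, a
        # Observation 2: memoize, since queries may repeat. Swapping above
        # also normalizes the cache key, so (3, 1) hits the entry for (1, 3).
        if (a, b) not in cache:
            cache[(a, b)] = sum(1 for w in adj[a] if w in adj[b])
        answers.append(cache[(a, b)])
    return answers

print(solve([(1, 2), (2, 3), (1, 3), (3, 4)], [(1, 3), (3, 1), (2, 4)]))
# → [1, 1, 1]
```

        Without the swap, a worst case pairs one huge-degree vertex with many distinct small ones; without the cache, the same big-big pair can be asked $$$Q$$$ times.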

        • »
          »
          »
          »
          »
          Rev. 2, 20 months ago, # ^ | +8

          I agree, calculating the complexity was much more difficult for me too. I just guessed it would be around $$$O(M \sqrt M \log M)$$$. The way my code is written, I don't know if that can be easily proven. (i.e. $$$\sum_{(a, b) \text{ query}} deg(a) \leq \, ?$$$)

          It's true, both observations count; I think I am just salty about it.

        • »
          »
          »
          »
          »
          20 months ago, # ^ | -18

          To me, caching doesn't seem like an observation at all. It's obvious you need to do it (once you notice you can have duplicate queries, of course; I was debugging my solution for hours because I didn't notice that).

          • »
            »
            »
            »
            »
            »
            Rev. 3, 20 months ago, # ^ | +18

            I guess once you reach a certain level, many things start to be obvious. And different people have different sets of "many things".

            I agree that in query tasks with a fixed state, it is obvious that caching the answer won't make your solution worse. But oftentimes caching is not needed to achieve the complexity you want.

            Here it is needed, because some edge cases have bad complexity. Once I thought those edge cases through, it was also clear/"obvious" to me that after caching you are only left with cases which have $$$\sqrt N$$$ edges in the worst case. But for the first 15 minutes of looking at that task, it was totally not obvious to me. I was looking for solutions with good complexity without caching; I hadn't been confronted with such tasks before. And I feel like the upvotes on my first comment are from people who agree that it wasn't obvious for them either.

            Old People anecdote

            And I feel for you: debugging for hours just to then realize that one of your assumptions about the statement was wrong is very annoying. I'm sure the moment you noticed it, all you could think was "What is this annoying bullsh*t", and we have all had those moments. But tell me: if you hadn't had this mistaken assumption, would you have thought about caching (and its implications for the complexity) right away after reading the task, or only after analyzing the problem and its edge cases?

            • »
              »
              »
              »
              »
              »
              »
              20 months ago, # ^ | +3

              Idk, you don't solve this problem as a problem with duplicate queries. If you see that a problem can have duplicate queries, you just solve it as if it cannot have them, since it's always trivial to add caching. Repeated queries just add an extra layer of implementation; they don't add any extra thought.

              • »
                »
                »
                »
                »
                »
                »
                »
                20 months ago, # ^ | 0

                That's what I wanted to point out. Btw, did you figure out the duplicate queries just from the pretests? Mine ran too quickly to even suspect it.

»
Rev. 2, 20 months ago, # | +6

Hey, for some reason I'm getting this error when I try to submit for problem B2 ("Oops, an error occurred on our end while processing your submission. Please try again."), and I'm not sure why. I don't have this problem with B1, even though I'm using the same code... Any idea what causes this? I got this error during the contest as well.

»
Rev. 2, 20 months ago, # | +29

Btw for problem D, instead of a massive LP setup, solution 2 can be proved using a similar method to solution 1 (which also shows how the two solutions are related).

Consider processing a query between $$$(x, y)$$$, WLOG $$$\deg(x) \leq \deg(y)$$$, so our algorithm will process a query in either constant time if it was already memoized, or $$$\mathcal O(\deg(x))$$$ otherwise. Choose some threshold $$$B$$$. We split into two cases:

Case 1: $$$\deg(x) < B$$$. In the worst case, this query appeared for the first time and wasn't memoized, so it gets processed in $$$\mathcal O(B)$$$ time. In total, we have $$$\mathcal O(QB)$$$.

Case 2: $$$\deg(x) \geq B$$$. There can only be at most $$$\mathcal O(\frac{M}{B})$$$ such vertices, which we will call big vertices. Consider the $$$\mathcal O(\deg(x))$$$ iteration. How many times can it happen? Since $$$\deg(y) \geq \deg(x)$$$, vertex $$$y$$$ must also be one of the $$$\mathcal O(\frac{M}{B})$$$ big vertices. So the total work is bounded by $$$\mathcal O(\sum_{x \in \text{big vertices}} \deg(x) \cdot \frac{M}{B}) = \mathcal O(\sum_{x=1}^N \deg(x) \cdot \frac{M}{B}) = \mathcal O(M \cdot \frac{M}{B})$$$.

Choosing our threshold for analysis as $$$B = \frac{M}{\sqrt Q}$$$ gives us the same runtime of $$$\mathcal O(QB + \frac{M^2}{B}) = \mathcal O(M \sqrt Q)$$$.

From this analysis, we can see that solution 1 is just solution 2, but instead of memoizing, we do all the worst case precomputation before answering any queries. It's just a difference in where we do the precomp work for query pairs of two big vertices.

EDIT: I realized the editorial analysis for solution 1 considers precomputing all queries containing big vertices instead of just big-big pairs. Luckily, it's basically the same analysis. Tweak case 2 to process all $$$\deg(y) \geq B$$$, regardless of $$$\deg(x)$$$, and have case 1 only process $$$\deg(y) \leq B$$$, which implies $$$\deg(x) \leq B$$$ as well.
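To spell out the last step of the analysis above, plugging the threshold $$$B = \frac{M}{\sqrt Q}$$$ into both terms confirms that they balance:

```latex
QB = Q \cdot \frac{M}{\sqrt Q} = M\sqrt Q,
\qquad
\frac{M^2}{B} = M^2 \cdot \frac{\sqrt Q}{M} = M\sqrt Q .
```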

»
20 months ago, # | 0

I love the B2 editorial

»
20 months ago, # | +3

Overall, I enjoyed the contest. But I don't understand why you didn't, or couldn't, fix the issue with failing to submit B2. I tried to submit it in the last 4 hours expecting the problem to have been solved, but it still occurred. I think you should fix such problems during the contest so that later contestants don't run into them. Or was there a technical reason why you couldn't fix it? If so, could you share it?

  • »
    »
    20 months ago, # ^ | +2

    It's just common sense. The contest was running, and people could qualify the way it was, so there was no critical urgency to fix anything, while tweaking a running system always carries a non-zero chance of introducing some nasty, unexpected, fatal regression that ruins the whole contest for many participants.

    I'm sure that they will make an effort to fix all discovered problems before the start of the next round.

»
20 months ago, # | +46

Here is a video of me solving the round in Kotlin. It was the first time I solved with commentary, so there are a lot of issues with the video, but I hope it will still be interesting to someone.

»
20 months ago, # | +1

Thanks for the Second Meta Cup <3

»
20 months ago, # | 0

When will the problems be available for practice?

»
Rev. 2, 20 months ago, # | 0

I'd like to share another offline solution for D. Instead of caring about query nodes, we focus on the intermediate ones in the 2-step path. Let $$$K$$$ be the threshold and call node $$$i$$$ big iff its degree $$$deg_i > K$$$.

  • Small nodes: for each small node, iterate through all pairs of its neighbors and update the answer if there exist queries for that pair. Using a hash map to look up queries, this takes $$$O(deg_i^2)$$$ per small node, and $$$\sum{deg_i^2}$$$ over all small nodes is $$$O(MK)$$$.
  • Big nodes: there are at most $$$\frac{2M}{K}$$$ of them, so for each query we can simply iterate through the list of big nodes. This takes $$$O(\frac{M}{K})$$$ per query.

So our total complexity is $$$O(MK + \frac{QM}{K})$$$ and the optimal $$$K$$$ is around $$$\sqrt{Q}$$$.
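A minimal sketch of the small/big bucketing described above, under the hypothetical simplification that each query $$$(a, b)$$$ asks for the number of length-2 paths $$$a \to w \to b$$$ (the actual problem D aggregates more data per path, but the bucketing is the same):

```python
from collections import defaultdict

def solve_offline(edges, queries, K):
    # Hypothetical simplification: count length-2 paths a -> w -> b per query.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Normalized query pair -> indices of queries asking about that pair.
    want = defaultdict(list)
    for i, (a, b) in enumerate(queries):
        want[(min(a, b), max(a, b))].append(i)

    ans = [0] * len(queries)
    big = [w for w in adj if len(adj[w]) > K]

    # Small intermediate nodes: enumerate neighbor pairs, O(deg^2) each,
    # O(MK) in total; a hash-map lookup finds the matching queries.
    for w in adj:
        if len(adj[w]) > K:
            continue
        nb = adj[w]
        for i in range(len(nb)):
            for j in range(i + 1, len(nb)):
                key = (min(nb[i], nb[j]), max(nb[i], nb[j]))
                for q in want.get(key, ()):
                    ans[q] += 1

    # Big intermediate nodes: at most 2M/K of them; scan them per query.
    bigset = {w: set(adj[w]) for w in big}
    for i, (a, b) in enumerate(queries):
        for w in big:
            if a in bigset[w] and b in bigset[w]:
                ans[i] += 1
    return ans

print(solve_offline([(1, 2), (2, 3), (1, 3), (3, 4)], [(1, 3), (2, 4), (1, 4)], 2))
# → [1, 1, 1]
```

Note that duplicate queries are harmless here: each query index accumulates its own answer, and the big-node scan is $$$O(\frac{M}{K})$$$ per query regardless of repetition.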

  • »
    »
    20 months ago, # ^ | 0

    I did exactly this and TLE'd. My solution had been running for about 8 minutes when I halted it.

»
20 months ago, # | 0

Thanks for the contest! I liked B2 and C.
B1 is a slightly easier version of a recent TopCoder problem. Even the legends are almost the same :)

»
Rev. 2, 20 months ago, # | +3

Here's a write-up on various approaches to tackle Problem D:

https://techvineyard.blogspot.com/2022/09/meta-hacker-cup-2022-qualification.html

It's interesting that the two simple optimizations to the online query approach reduce the O(M*Q) runtime so easily.