Petr's blog

By Petr, history, 7 years ago, In English
  • Vote: I like it
  • +39
  • Vote: I do not like it

»
7 years ago, # |
Rev. 3   Vote: I like it -24 Vote: I do not like it

To be honest, this spanning tree problem is a shameful entry in your glorious problemsetting record. Could have been solved by "randomize from scratch as long as you don't have everything preprocessed with minor optimizations" and basically no significantly better solution existed . It lasted pretty long what connected with my constant curiosity whether I already got 9900 or 9910 trees and few mistakes like not redirecting output to file and realizing it late caused me to waste an hour in totally not productive things (not including proper coding) which I could have spent on solving interesting problems. Such "experimental" problems are sometimes interesting but this was definitely not one.

  • »
    »
    7 years ago, # ^ |
      Vote: I like it +94 Vote: I do not like it

    Thanks for sharing your opinion!

    I will respectfully disagree, though — when I was looking for a solution myself, I had to make quite a few improvements until I got everything generated in a few minutes (I didn't try waiting for an hour). tourist has mentioned that his solution generates all answers in 1 minute, so he didn't even have to implement preprocessing — all in all I'd say that something significantly better was possible, and that getting this problem accepted in 15 minutes instead of an hour requires both intuition and skill, just like more "typical" problems.

    It seems that you took a risk by deciding to not improve the solution and just wait until it finishes, and the risk did not pay off?..

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it -38 Vote: I do not like it

      Hmmm, I remember that Gennady's solution ran for something like 10-15 mins on Marcin's laptop which is probably similar to mine, so maybe Google's onsite machines were just much faster than laptops of mere mortals. I was participating as shadow, so I used my own laptop to generate answers and generating them took me something like ~30 mins, but because of some silly mistakes I needed to run it like 3-4 times what turned out to be time consuming black hole for me. Maybe on Google's machines it would be few minutes and all my griefs would be gone?

      My optimization was to note that if I have a graph G and e is one of its edges and I know number of spanning trees in G and G\e then by some simple formula I also know number of spanning trees in all graphs such that I subdivide e k times (<=> replace e with a path of an arbitrary length). It allowed me to calculate number of spanning trees for few more graphs just by calculating determinant once more and get answers for graphs with properties that could be hard to get by randomization.

      By the way, talking about tourist's submission I already looked at it on GCJ and it seemed weird to me. It is significantly different from other solutions because he is not using randomization in any way. Here is his code: https://ideone.com/CKbI7B As far as I understand he searches graphs space by some dfs, but keeping just one state for every pair (number of spanning trees, used vertices). To me it looks fishy because I see no reason why we should discard partially built graph if we already encountered another graph with same number of vertices and spanning trees — they can produce different number of spanning trees in future. My understanding is that he just got lucky that all numbers of trees were generated by this code, but I already learnt that if I doubt in his solution it means I am wrong :P. Any opinions on that?

      • »
        »
        »
        »
        7 years ago, # ^ |
          Vote: I like it +43 Vote: I do not like it

        Why "remember" if you can just try running my solution? :) It finishes in 55 seconds on my laptop, so is your laptop actually 10-15 times slower than mine?

        Of course, my dfs optimization is heuristic. If you're trying to search the whole space of graphs on 22 vertices, there is surely no reason to apply it. If you're trying to solve the problem -- find a single graph with K spanning trees for every K -- then it's reasonable: for example, isomorphic graphs are considered just once, and graphs obtained are more diverse in general.

        • »
          »
          »
          »
          »
          7 years ago, # ^ |
            Vote: I like it +18 Vote: I do not like it

          Why "remember" if you can just try running my solution? :)

          I was on Windows when writing my post and switching to Linux takes few minutes :D. But yeah, I am finally on Linux and it turned out that your code runs 2,5 mins on my laptop (and 2 on Marcin's) which is completely fine. It seems I either misremembered or we did some mistake like not using -O2 last time, sorry for that. That said, it partially also invalidates my hopes that on Google machines running my code would take me few minutes, but x2,5 speed up would help me a lot.

          By the way now I see why your heuristic is a good one :), it "diversifies search space" while graphs I was looking at were probably just random mess with kinda similar properties. And even you were not lucky and your code would generate, say, 9990 graphs, keeping 3 graphs for every pair and adding some randomization would very likely do the thing.

      • »
        »
        »
        »
        7 years ago, # ^ |
          Vote: I like it +17 Vote: I do not like it

        Here are the improvements I remember making: 1) instead of generating a new graph randomly, we'll either add or remove an edge to the previous one. 2) in order to decide whether to add or remove, we pick a random goal from numbers we have not yet seen, and until we reach it we add when we have less and remove when we have more. This makes it more likely to hit a number we need quickly. 3) when removing, make sure the graph stays connected 4) notice that at some point all numbers except 13 and 22 are done, and manually print cycles for those.