Блог пользователя GLAYS

Автор GLAYS, история, 4 года назад, По-английски

Hello!

When I need to stress-test a problem on trees, I usually find a random tree like this:

int n = random(1, 100);
for(int i = 2; i <= n; ++i){
    p = random(1,i - 1);
    addEdge(p, i), addEdge(i, p);
}

As in; to find a random tree of $$$N$$$ nodes, I just find a random sequence $$$P$$$(parents) of length $$$N-1$$$ with each $$$1<=Pi<=i-1$$$ for each $$$2<=i<=N$$$.

And it worked well, or so I believed at least.

But I remember watching an Errichto stream(I think it's the AtCoder DP contest one) where he showed how he stress tests and then said that the previous method only prints a specific kind of trees, and that he recommends using Prüfer Sequences. I kept thinking about why that's true but found no answer online.

Of course, Prüfer Sequences are clearly the best and most trusted choice since it's proved that any sequence represents a unique tree and that any tree can be represented by a sequence(more info here). And though it's not that complicated, I still don't want to keep writing it whenever I need to, unless there's an important difference between the two methods.

So if anyone would enlighten me on the difference if it exists, that would be great :D (It would be good if it's Errichto too xd)

Any other advices/tricks on stress-testing graphs or in general would be awesome!

Thankss ^^

  • Проголосовать: нравится
  • +13
  • Проголосовать: не нравится

»
4 года назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

In addition to random tests, you should specifically generate trees such as lines and stars. Random methods aren't going to produce them for large $$$N$$$, and many issues are exposed only by such trees.

One neat trick is that many solutions compute some property of an unlabeled tree but do so in a way that is influenced by labeling (for example, if you always traverse the tree from root 1). Relabeling the tree in a test case and running it again to make sure you get the same result can be an easy sanity check done with a large generated input for which you don't know the right answer.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Thankss! Relabelling does indeed seem like a very effective sanity check that could probably cover most uncovered tests without it. I'll remember to re-root every tree I find and try all other nodes from 2 to N.

»
4 года назад, # |
Rev. 3   Проголосовать: нравится 0 Проголосовать: не нравится

Edit: wrong

Spoiler
  • »
    »
    4 года назад, # ^ |
    Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

    Thanks!

    Yes I did find that it doesn't find them all of course(even from the fact that that kind of sequence is only a special case of the Prüfer Sequences), but I failed to find something in common between them.

»
4 года назад, # |
  Проголосовать: нравится +3 Проголосовать: не нравится

Some solutions might assume something about the order of vertices and work for all trees generated by your code, but fail on something else. An example tree that you can't create is 1-3-2. An easy fix is to shuffle the names of vertices before printing.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Thanksss! It does make sense.

    I guess constructing a permutation of length $$$N$$$ and permuting the nodes' IDs will do it, it will even count as doing the re-rooting trick from up top.

»
4 года назад, # |
Rev. 3   Проголосовать: нравится +19 Проголосовать: не нравится

I can't believe this isn't mentioned yet, but here's another flaw: in your method, for any vertex the expected distance from 1 is $$$\mathcal{O}(\log N)$$$. So for example, a stupid $$$\mathcal{O}(N)$$$ LCA algorithm will work perfectly. If you want to check for TLE, you should also try some kind of "longer" trees.

»
4 года назад, # |
  Проголосовать: нравится +16 Проголосовать: не нравится

Using different distributions for parent vertex p = random(1,i - 1); you may get different types of trees. For example you can choose parent among previous K vertices or generate several values and take maximum to get longer trees.