Guidance needed on generating random graphs (trees) for testing

Hello!

When I need to stress-test a problem on trees, I usually generate a random tree like this:

int n = random(1, 100);
for (int i = 2; i <= n; ++i) {
    int p = random(1, i - 1);      // parent of i, chosen among earlier vertices
    addEdge(p, i), addEdge(i, p);  // store the edge in both directions
}

That is, to generate a random tree with $$$N$$$ nodes, I pick a random parent sequence $$$P$$$ of length $$$N-1$$$ with $$$1 \le P_i \le i-1$$$ for each $$$2 \le i \le N$$$.

And it worked well, or so I believed at least.

But I remember watching an Errichto stream (I think it's the AtCoder DP contest one) where he showed how he stress-tests, then said that the method above only produces a specific kind of tree and that he recommends using Prüfer sequences. I kept thinking about why that's true but found no answer online.

Of course, Prüfer sequences are the most trustworthy choice, since it is proven that every sequence of length $$$N-2$$$ over the labels $$$1..N$$$ represents a unique labeled tree, and that every labeled tree is represented by exactly one such sequence (more info here). And although decoding one is not that complicated, I still don't want to rewrite it every time I need it, unless there's an important difference between the two methods.
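For reference, here is a minimal sketch of what the Prüfer approach could look like. The names `pruferToEdges` and `randomTree` are made up for this example, and it assumes vertices are labeled 1..n and that `std::mt19937` is random enough for stress tests. It decodes with a set of current leaves, so it runs in O(n log n).

```cpp
#include <cassert>
#include <random>
#include <set>
#include <utility>
#include <vector>

// Decode a Prüfer sequence (values in 1..n, length n-2) into the n-1 edges
// of the labeled tree it represents. Hypothetical helper for illustration.
std::vector<std::pair<int, int>> pruferToEdges(const std::vector<int>& code) {
    int n = (int)code.size() + 2;
    std::vector<int> degree(n + 1, 1);  // every vertex starts as a leaf
    for (int v : code) ++degree[v];     // each occurrence adds one neighbor

    std::set<int> leaves;
    for (int v = 1; v <= n; ++v)
        if (degree[v] == 1) leaves.insert(v);

    std::vector<std::pair<int, int>> edges;
    for (int v : code) {
        int leaf = *leaves.begin();     // smallest current leaf
        leaves.erase(leaves.begin());
        edges.emplace_back(leaf, v);
        if (--degree[v] == 1) leaves.insert(v);  // v became a leaf
    }
    // Exactly two leaves remain; they form the last edge.
    edges.emplace_back(*leaves.begin(), *leaves.rbegin());
    return edges;
}

// Draw a uniformly random Prüfer sequence and decode it (n >= 2).
std::vector<std::pair<int, int>> randomTree(int n, std::mt19937& rng) {
    assert(n >= 2);
    std::uniform_int_distribution<int> pick(1, n);
    std::vector<int> code(n - 2);
    for (int& v : code) v = pick(rng);
    return pruferToEdges(code);
}
```

For instance, the one-element sequence {3} decodes to the edges (1,3) and (2,3), i.e. the path 1-3-2, where vertex 2 has no smaller-numbered neighbor, so no parent sequence with $$$P_i < i$$$ can describe it.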

So if anyone would enlighten me on the difference if it exists, that would be great :D (It would be good if it's Errichto too xd)

Any other advice or tricks on stress-testing graphs, or stress-testing in general, would be awesome!

Thankss ^^

Comments (8)

saketh

5 years ago, # |

In addition to random tests, you should specifically generate trees such as lines and stars. Random methods aren't going to produce them for large $$$N$$$, and many issues are exposed only by such trees.
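A rough sketch of such hand-made shapes, assuming 1-based labels and an edge-list output (the function names `lineTree` and `starTree` are just placeholders for this example):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Path 1-2-...-n: the deepest possible tree.
std::vector<std::pair<int, int>> lineTree(int n) {
    assert(n >= 1);
    std::vector<std::pair<int, int>> edges;
    for (int i = 2; i <= n; ++i) edges.emplace_back(i - 1, i);
    return edges;
}

// Star: every other vertex hangs off `center`, giving the largest degree.
std::vector<std::pair<int, int>> starTree(int n, int center = 1) {
    assert(1 <= center && center <= n);
    std::vector<std::pair<int, int>> edges;
    for (int i = 1; i <= n; ++i)
        if (i != center) edges.emplace_back(center, i);
    return edges;
}
```

Lines tend to catch recursion-depth and per-query O(depth) problems, while stars tend to catch solutions that are quadratic in the degree of a vertex.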

One neat trick is that many solutions compute some property of an unlabeled tree but do so in a way that is influenced by labeling (for example, if you always traverse the tree from root 1). Relabeling the tree in a test case and running it again to make sure you get the same result can be an easy sanity check done with a large generated input for which you don't know the right answer.

GLAYS

5 years ago, # ^ |

Thankss! Relabelling does indeed seem like a very effective sanity check that could probably catch most of the cases my tests would otherwise miss. I'll remember to re-root every tree I generate and try every other node from 2 to N as the root.

nmakeenkov

Edit: wrong

Thanks!

Yes, I did notice that it doesn't produce them all, of course (if only because that kind of sequence is just a special case of Prüfer sequences), but I failed to find what the trees it does produce have in common.

Errichto

Some solutions might assume something about the order of vertices and work for all trees generated by your code, but fail on something else. An example tree that you can't create is 1-3-2. An easy fix is to shuffle the names of vertices before printing.
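One possible way to do the shuffle, as a sketch: it assumes the tree is already stored as an edge list with labels 1..n, and `relabel` is a made-up name. It also randomizes the edge order and the endpoint order, which is not part of the suggestion above but is cheap to add.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Rename vertices with a random permutation of 1..n, then shuffle the
// edge order and the orientation of each edge. Hypothetical helper.
void relabel(int n, std::vector<std::pair<int, int>>& edges, std::mt19937& rng) {
    std::vector<int> perm(n + 1);
    std::iota(perm.begin(), perm.end(), 0);           // perm[i] = i
    std::shuffle(perm.begin() + 1, perm.end(), rng);  // keep index 0 unused

    for (auto& e : edges) {
        e.first = perm[e.first];
        e.second = perm[e.second];
        if (rng() & 1) std::swap(e.first, e.second);  // random orientation
    }
    std::shuffle(edges.begin(), edges.end(), rng);
}
```

The shape of the tree is unchanged, so a solution whose answer does not depend on labels should print the same result before and after calling it.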

Thanksss! It does make sense.

I guess constructing a random permutation of length $$$N$$$ and permuting the nodes' IDs with it will do. It will even count as doing the re-rooting trick from up top.

-is-this-fft-

+19

I can't believe this isn't mentioned yet, but here's another flaw: in your method, for any vertex the expected distance from 1 is $$$\mathcal{O}(\log N)$$$. So, for example, a stupid LCA algorithm that takes $$$\mathcal{O}(N)$$$ per query in the worst case will run fast on every tree you generate. If you want to check for TLE, you should also try some kind of "longer" trees.
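A short sketch of where the logarithm comes from, writing $$$d_i$$$ for the expected depth of vertex $$$i$$$ (notation introduced here, not in the post). The parent of $$$i$$$ is uniform over $$$1..i-1$$$, so

$$$d_1 = 0, \qquad d_i = 1 + \frac{1}{i-1}\sum_{j=1}^{i-1} d_j \quad (i \ge 2).$$$

By induction this gives $$$d_i = H_{i-1} = \sum_{k=1}^{i-1} \frac{1}{k}$$$: for example $$$d_2 = 1$$$ and $$$d_3 = 1 + \frac{0 + 1}{2} = \frac{3}{2}$$$. Since $$$H_{N-1} \approx \ln N$$$, even with $$$N = 10^5$$$ a typical vertex sits only about 12 edges below vertex 1.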

MrDindows

+16

By using a different distribution for the parent vertex in place of p = random(1, i - 1), you can get different types of trees. For example, you can choose the parent among the previous K vertices, or generate several values and take the maximum, to get longer trees.
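Both ideas could be sketched roughly like this. The function names and the `k` / `tries` parameters are invented for the example, and each function returns a 1-based parent array with `parent[1] = 0` as a convention chosen here.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Parent of i is uniform among the previous k vertices: small k gives
// long, thin trees (k = 1 is exactly the path 1-2-...-n).
std::vector<int> parentsFromLastK(int n, int k, std::mt19937& rng) {
    assert(k >= 1);
    std::vector<int> parent(n + 1, 0);
    for (int i = 2; i <= n; ++i) {
        int lo = std::max(1, i - k);
        parent[i] = std::uniform_int_distribution<int>(lo, i - 1)(rng);
    }
    return parent;
}

// Parent of i is the maximum of `tries` uniform draws from [1, i-1]:
// more tries bias the parent toward recent vertices, so depth grows.
std::vector<int> parentsMaxOfDraws(int n, int tries, std::mt19937& rng) {
    assert(tries >= 1);
    std::vector<int> parent(n + 1, 0);
    for (int i = 2; i <= n; ++i) {
        std::uniform_int_distribution<int> pick(1, i - 1);
        int best = pick(rng);
        for (int t = 1; t < tries; ++t) best = std::max(best, pick(rng));
        parent[i] = best;
    }
    return parent;
}
```

With `tries = 1` the second function reduces to the original p = random(1, i - 1) method, so one generator with a randomly chosen `tries` can cover both shallow and deep trees.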

GLAYS's blog