When submitting a solution in C++, please select either C++14 (GCC 6-32) or C++17 (GCC 7-32) as your compiler. ×

Romok007's blog

By Romok007, history, 4 years ago, In English

Hi Everyone, I am stuck in an interview problem and not finding any resources on the internet which can help me find a solution, if you provide an idea about the solution it will be of great help. The problem is as follows :

Generate a random binary search tree of 'n' nodes with equal probability.

For example : if n = 3 there are 5 possible binary search trees, our program should return any one tree with equal probability (i.e. 1/5).

  • Vote: I like it
  • +8
  • Vote: I do not like it

| Write comment?
»
4 years ago, # |
  Vote: I like it +30 Vote: I do not like it

For each $$$i = 1,2, \ldots, n$$$, you can calculate the probability that in a randomly chosen binary search tree on $$$n$$$ nodes, $$$i$$$ is the root. Use those probabilities to take a weighted random $$$i$$$ from $$$1, 2, \ldots, n$$$, and set it as the root. The nodes $$$1, 2, \ldots, i - 1$$$ will go to the left subtree and $$$i + 1, \ldots, n$$$ will go to the right subtree. Now you can recursively generate random left and right subtrees.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Will this work when n is high? Like n = 1000?

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Well, assuming infinite precision and constant arithmetic, it is $$$O(n \log n)$$$ in the average case. The number of binary search trees for all $$$n$$$ in $$$1, \ldots, n$$$ can be calculated in $$$O(n)$$$ (it's the Catalan numbers). In each level of recursion we need to evaluate $$$O(n)$$$ probabilities and there are $$$O(\log n)$$$ levels of recursion in the average case.

      I suppose it does get a bit awkward with big numbers. But $$$n = 1000$$$ should not be so bad; the 1000th Catalan number is still less than 2000 bits long.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        The expected depth of recursion is actually $$$O(\sqrt n)$$$ (Knuth, p. 15). Page 13 of the same article gives an algorithm to generate bracket sequences (and thus also binary trees) which runs in $$$O(n)$$$ time and uses numbers not larger than $$$O(n^2)$$$.

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

There is a one-to-one correspondence between rooted binary trees and rooted trees with ordered children (left child <-> left child, right child <-> right sibling). We can generate a rooted tree with ordered children uniformly at random using Prüfer codes then convert it to a binary tree.

EDIT: it doesn't work

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I dont understand about the one-to-one correspondence part. Can you please elaborate a bit?

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 3   Vote: I like it 0 Vote: I do not like it

      Given a rooted tree $$$T$$$ with ordered children, we can construct a rooted binary tree $$$B$$$ by making $$$u$$$'s left child in $$$B$$$ its first child in $$$T$$$, and $$$u$$$'s right child in $$$B$$$ the next sibling in order in $$$T$$$. Conversely, we can perform the reverse operation to turn a binary tree into a rooted tree. Thus each binary tree corresponds to one rooted tree and vice versa, so the given problem is equivalent to generating a rooted tree (with ordered children) uniformly at random, since we can convert freely between the two without losing any information.

      The Wikipedia page has some pictures that might help visualize the transformation.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Prufer codes do not help you generate rooted trees with ordered children. You need Catalan numbers for that.

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      Correct me if I'm wrong, but I think that generating a random labeled tree with Prüfer codes, choosing the root randomly, and ordering children according to their label should give a rooted tree with ordered children generated uniformly at random. I haven't proved it, but it should work (it should be the same as ordering children randomly).

      EDIT: it can't be correct since $$$C_{n-1}$$$ does not divide $$$n^{n-2}$$$.