I have a non-binary tree.
I want to randomly find a “sub-trees” that are connected from root to leaf which must have at least n leaf-nodes (leaf-nodes of sub-tree must be leaf-nodes of given tree).
For example: given a tree as below
*A
/
B C
/
*E D *F
*J
n = 2
A random sub-tree that contains 2 leaf-nodes could be:
A
/
B C
/
E F
or
A
/
B C
D F
J
or
A
/
B
/
E D
J
I don’t need to get all possible sub-trees, just randomly get it with a given tree and n
2
Randomly choose n leaf nodes. Return the tree that contains those leaf nodes and their ancestors.
Whenever you say “random”, you should specify what probability distribution you want. Unless you nail down the probability distribution, any of these “solutions” will meet your needs:
- Always pick the leftmost n leaves.
- Always pick the rightmost n leaves.
- Always pick the middle n leaves.
- Visit all the leaves in some desired order. Put each one at the end of a list. If the list now contains more than n nodes, delete one of them at random.
- Make a list of all the leaves, and then repeatedly discard one at random until there are only n left.
- Choose a random number from 1 to 5. Use the corresponding method from the above list.
Notice that the word “random” appears several times in the above. Each of these instances of “random” should be qualified by its probability distribution.
But especially notice that randomly picking n items from a larger set is a completely separate sub-problem that has nothing to do with trees. First you solve that, then you come back to the trees domain to build the tree that has your chosen leaf nodes.
Try solving each of these sub-problems. I think you’ll find that neither is terribly difficult, once you put the other sub-problem temporarily out of your mind.
Otherwise, if you’re trying to build the tree while you’re picking the leaves to go into it, you’re going to drive yourself mad.
2