Skip to content

Shortest subtree that contains half the nodes of a population #1343

Answered by petrelharp
stsmall asked this question in Q&A
Discussion options

You must be logged in to vote

Trees already keep track of the number of samples below each node (and efficiently update this), so you could do something like:

ts = msprime.sim_ancestry(2, recombination_rate=0.1, sequence_length=40)
n0 = ts.num_samples / 2
for t in ts.trees():
    tmrcah = min([t.time(n) for n in t.nodes() if t.num_samples(n) > n0])
    print(f"TMRCAH on {t.interval} is {tmrcah}")

or maybe more efficiently

for t in ts.trees():
    tmrcah = np.inf
    for n in t.nodes(order='timeasc'):
        if t.num_samples(n) > n0:
          tmrcah = t.time(n)
          break
    print(f"TMRCAH on {t.interval} is {tmrcah}")

Replies: 1 comment 1 reply

Comment options

You must be logged in to vote
1 reply
@stsmall
Comment options

Answer selected by stsmall
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Q&A
Labels
None yet
2 participants