Replies: 6 comments
-
Hm, let's see. Good questions. To see what's going on here, consider what we'd have to do to add fixed mutations to a tree sequence: how many should we add? There are an unbounded number of "fixed mutations" in the history of any sample if you look back far enough in time, so we clearly need to say how far back we want to add them. Since msprime adds mutations to trees, and more specifically to branches of trees, that's how we tell msprime where we want the mutations: a fixed mutation is one that falls on the branch above the root of the tree, so if there is a branch above the root at a site, we can get fixed mutations on it.

Now, the branch above the root of a tree is redundant, in some sense - we know it's there, and it doesn't affect relationships between samples. The simplify() operation removes everything you don't need to reconstruct the trees, and so removes anything above the root (i.e., above the MRCA of all the samples). This explains that sentence in the SLiM manual. So: if you first simplify a tree sequence and then use msprime to add mutations, then none of the mutations that msprime added will be fixed. However, you say:
I'm guessing this is because you started with a tree sequence produced by SLiM that already had fixed mutations in it? I think that simplify would not remove these (but I'm not checking right now). So, I think what you need to do is simple: either (a) don't call

And, good point about the tutorial - what it says is correct, but in the example provided there's no change, so it's a bad example for that point.
It's on the list to add this method, but for now you can look here: tskit-dev/tskit#504. Hope that helps?
-
Also: we should probably convert this to a discussion and raise the point about the example in the tutorial as a separate issue.
-
Thanks for your reply.
Well, actually, I started my simulation without any mutations, but I included a bottleneck event in it. Maybe the bottleneck event accounts for the presence of the fixed mutations?
-
Well, how do I do this? Actually, it's my first time commenting on GitHub O(∩_∩)O
-
Oh god, I know why there are still some fixed mutations in my results after running simplify(). I calculated the mutation frequency in a sub-population that came from a sample of 100 out of 5000, rather than in the whole population. My bad.
-
I had to convert it. Done!
-
Hi. I have some questions about the simplify() method of tree sequences. I'm a SLiM user. I need to run a simulation with neutral and deleterious mutations. Since it takes a lot of time, I decided to use tree-sequence recording to leave the neutral mutations out of my simulation and overlay them onto the tree sequence after the simulation finishes. When I tried to do that, I noticed this sentence in the SLiM manual: Here, our chosen goal is to overlay mutations only back to the point of coalescence, and so we call simplify() to strip away all ancestral information above the point of coalescence. (If we wanted to overlay fixed mutations as well, past coalescence back to the start of forward simulation, then we would not call simplify().)
Actually, I do need to overlay the fixed mutations, so I tried to understand the consequences of simplify(), but found it hard to understand. I overlaid neutral mutations onto my tree sequence with msprime. I tried calling "msprime.sim_mutations(...)" with and without "ts.simplify()" before it. Given that sentence in the SLiM manual, I expected the results of the model with simplify() not to have any fixed mutations, but that does not seem to be true. When I loaded the tree sequence back into SLiM, I found that there were some fixed mutations. The counts of fixed mutations "with simplify()" and "without simplify()" are indeed different, though. I am confused about this. Did I misunderstand the sentence in the SLiM manual? Could you explain it for me and show a detailed example?
When I tried to find the tutorial on simplify(), I found what may be a mistake in "Completing forwards simulations". The last part of this tutorial seems to show the consequences of simplify(), which is exactly what I want to know, but the last figure is identical to the figure before it. Did I misunderstand it, or is it indeed a mistake?
By the way, I find it's not easy to access the frequency or count of each mutation. Is there a straightforward way to do that, such as a single method or a property of the mutation object?