Edge_Diffs() with >4 edges_out (and edges_in) #1835
Hi, I'm a big fan of the tskit and msprime software and use them all the time.

In the course of a recombination inference analysis, I noticed that some tree transitions have more than 4 edges_out (= #edges_in). As far as I can figure, a single recombination event should only be compatible with 2, 3, or 4 edges_out (= #edges_in), so I am a little confused as to how this makes sense. It only occurs for a small fraction of transitions in vanilla simulations of a single population.

In msprime, can a single transition encode multiple recombination events? Any pointers as to what is going on? As far as I can tell, this does not happen for a vanilla ms.simulate() tree sequence, but it does happen when using ms.sim_ancestry() for what I think should be the same single-population, constant-size simulation. See the examples below for what I mean exactly.

The count of 2, 3, or 4 edges in/out for a single recombination event seems to accord with Kelleher, Jerome, Alison M. Etheridge, and Gilean McVean. "Efficient coalescent simulation and genealogical analysis for large sample sizes." PLoS Computational Biology 12.5 (2016), but I am not sure about that.

Thanks!

Example cases
ms.sim_ancestry:
Replies: 1 comment 1 reply
Good question, @mebh! With msprime v1.0, we've switched to a default of a discrete genome, i.e., having recombinations occur at integer positions. This applies to `sim_ancestry`, as you've noted, but not to `simulate`, as we wanted to keep things backwards compatible. So, what you've observed is just what I'd expect - a small number of breakpoints that have been hit multiple times through the course of the simulation, just as now by default `sim_mutations` will produce a (usually small) proportion of sites with multiple mutations.