Hi @qianfeng2! 👋 You can use the `haplotypes` method on the tree sequence to get the alleles at the variable sites for each sample, something like this:

```python
import msprime

# 10 diploid individuals by default, i.e. 20 sample genomes of length 100
ts = msprime.sim_ancestry(10, sequence_length=100)
ts = msprime.sim_mutations(ts, rate=1)
# Each haplotype is a string with one allele per variable site
for h in ts.haplotypes():
    print(h)
```

The `sim_mutations` function is quite powerful and supports most of what Pyvolve does. It might be helpful to go through the main documentation page on mutations in msprime.

An alternative to printing out the haplotypes would be to export your sequences in VCF format. See the saving and exporting data part of the tutorial for more info.

Hope this helps!
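Since the haplotype strings cover only the variable sites, getting full-length sequences means filling in the invariant positions yourself. Here is a minimal sketch of that step in plain Python. It assumes you supply your own ancestral sequence (the `ancestral` string below is made up for illustration) and that the sites sit at integer coordinates, which you could collect with something like `[int(site.position) for site in ts.sites()]`. The helper `full_sequences` is a hypothetical name, not part of msprime or tskit.

```python
def full_sequences(ancestral, positions, haplotypes):
    """Expand variable-site haplotypes into full-length sequences.

    ancestral  -- assumed ancestral/reference sequence (str), user supplied
    positions  -- 0-based integer coordinate of each variable site
    haplotypes -- one string per sample, one allele character per site
    """
    sequences = []
    for hap in haplotypes:
        if len(hap) != len(positions):
            raise ValueError("haplotype length must match number of sites")
        seq = list(ancestral)
        # Overwrite the ancestral base at each variable site
        for pos, allele in zip(positions, hap):
            seq[pos] = allele
        sequences.append("".join(seq))
    return sequences


# Toy inputs standing in for the output of ts.sites() and ts.haplotypes()
ancestral = "ACGTACGTAC"
positions = [1, 4, 8]
haplotypes = ["CAA", "TAA", "CGC"]
result = full_sequences(ancestral, positions, haplotypes)
for seq in result:
    print(seq)
```

Because every variable site is overwritten with the sample's own allele, the base that the made-up ancestral string carries at those positions does not affect the result. Only the invariant positions come from it.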
-
To whom it may concern,
I hope you are well :)
I am using the powerful msprime 1.0.2 to simulate genealogical history with recombination, and then plan to use the generated sequences (which are represented by the present-day nodes in the tree) for downstream analysis.
Is msprime able to return the full DNA sequences for each extant node? For instance, if I simulate 10 sequences of 100 bp with a recombination rate of 0.1, can msprime return 10 raw sequences, each looking like ATGCGGC......?
Alternatively, since recombination means msprime produces a set of trees, I would need to simulate sequences along each of these trees using another simulation tool, like Pyvolve or INDELible, and then concatenate them to produce a sequence alignment.
Thanks in advance,
Qian