Is our point mutation operator biologically sound? #76

Open
mauriceling opened this issue Mar 26, 2014 · 0 comments

In our philosophical paper, we need to know whether computational mutation operators are a good representation of biological mutations. A single mutation is easy to explain and argue for, but repeated mutations can be tricky.

Volles and Lansbury (2005, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1166583/) provided data on error-prone PCR results, which we can use as baseline data to compare against.

Basically, PCR is an experimental technique for amplifying DNA in which the number of DNA molecules doubles per cycle: 1 --> 2 --> 4 --> 8... In the paper, 30 cycles were used, so 1 molecule is amplified to 2^30, or about 10^9 (1 billion), molecules. The mutation rate depends on the enzyme used. In this case it is Taq polymerase, with an error rate of 1 point mutation per 9000 bases (http://www.ncbi.nlm.nih.gov/pubmed/2847780). They then randomly picked 89 molecules from the pool of about 1 billion for evaluation (this is costly).
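The doubling arithmetic above can be checked with a few lines of Python. This is only a back-of-envelope sketch: it assumes perfect doubling in every cycle, and the per-base figure assumes a lineage that is copied in every one of the 30 cycles, which is the most copying any single molecule can have undergone. The variable names are illustrative and not taken from the project.

```python
# Back-of-envelope PCR arithmetic (idealized: perfect doubling every cycle).
CYCLES = 30
ERROR_RATE = 1.0 / 9000  # Taq point mutations per base copied (figure quoted above)

# Number of molecules descended from a single template after CYCLES doublings.
pool_size = 2 ** CYCLES

# Assumption: the lineage is copied once in every cycle. A real lineage may be
# copied fewer times, so this is a ceiling under the quoted rate, not a prediction.
max_mutations_per_base = CYCLES * ERROR_RATE

print(pool_size)
print(round(max_mutations_per_base, 4))
```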

Allowable bases: a, t, g, c

Mutation rate: 1.0/9000

Initial sequence: ctttcaaaggccaaggagggagttgtggctgctgctgagaaaaccaaacagggtgtggcagaagcagcaggaaagacaaaagagggtgttctctatgtaggctccaaaaccaaggagggagtggtgcatggtgtggcaacagtggctgagaagaccaaagagcaagtgacaaatgttggaggagcagtggtgacgggtgtgacagcagtagcccagaagacagtggagggagcagggagcattgcagcagccactggctttgtcaaaaaggaccagttgggcaagaatgaagaaggagccccacaggaaggaattctggaagatatgcctgtggatcctgacaatgaggcttatgaaatgccttctgaggaagggtatcaagactacgaacctgaagcctaa

Volles and Lansbury (2005) results are as follows:

| Wild-type base | A | G | C | T |
| --- | --- | --- | --- | --- |
| A | 11233 | 64 | 19 | 73 |
| G | 20 | 11979 | 2 | 12 |
| C | 8 | 2 | 6296 | 13 |
| T | 44 | 8 | 24 | 5975 |

That is 289 errors out of 35772 bases examined, which gives a detected error rate of about 0.0081 mutations per base.
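As a quick check of that figure, the counts can be summed directly. The sketch below assumes the diagonal entries are unchanged bases and everything off the diagonal is a detected substitution. The totals do not depend on whether rows or columns are read as the wild-type base.

```python
BASES = ["A", "G", "C", "T"]

# Counts copied from the table in this issue, in A, G, C, T order.
COUNTS = [
    [11233, 64, 19, 73],
    [20, 11979, 2, 12],
    [8, 2, 6296, 13],
    [44, 8, 24, 5975],
]

# Every cell is one examined base.
total = sum(sum(row) for row in COUNTS)

# Diagonal cells are bases that were read back unchanged.
unchanged = sum(COUNTS[i][i] for i in range(len(BASES)))

errors = total - unchanged
rate = errors / total

print(errors, total, round(rate, 4))
```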

Let's see if we can replicate this 0.0081.
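One possible way to set up the comparison is sketched below. It is not the project's actual mutation operator: `point_mutate`, `simulate_clone` and `detected_error_rate` are hypothetical names, and the model makes several simplifying assumptions. Each clone's lineage is copied once in every cycle, a mutated base is replaced by one of the other three bases with equal probability, and only differences from the initial sequence are counted, so a position that mutates and then reverts goes undetected.

```python
import random

BASES = "atgc"  # allowable bases


def point_mutate(sequence, rate, rng):
    """Return a copy of sequence where each base is substituted by a
    different base with probability `rate`."""
    out = []
    for base in sequence:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_clone(sequence, rate, cycles, rng):
    """Apply the point mutation operator once per cycle along one lineage."""
    for _ in range(cycles):
        sequence = point_mutate(sequence, rate, rng)
    return sequence


def detected_error_rate(original, clones):
    """Fraction of positions that differ from the original across all clones."""
    diffs = sum(a != b for clone in clones for a, b in zip(original, clone))
    return diffs / (len(original) * len(clones))


# Usage: 89 clones, 30 cycles, rate 1/9000, on a prefix of the initial sequence.
rng = random.Random(0)  # fixed seed so the run is repeatable
seq = "ctttcaaaggccaaggagggagttgtggctgctgctgag"
clones = [simulate_clone(seq, 1.0 / 9000, 30, rng) for _ in range(89)]
print(detected_error_rate(seq, clones))
```

Swapping in the full initial sequence and the project's own operator for `point_mutate` would give a number to set against the 0.0081 from the paper.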
