Mistake in choosing parallel contigs from embplant_pt-embplant_mt.fastg: mt vs pt #215
Comments
Thanks for reaching out with such detailed information! This is a great example of an issue report. Ideally, this situation should be avoided. However, contig 61416 (9.46x) has better blast hits, covering two genes (partial ycf2 and ycf15), whereas contig 61026 (158.3x) has only a partial ycf2 hit, so the current default algorithm preferred contig 61416. If you run contig 61416 through online blast, you may find that it aligns to chloroplast genomes, indicating that it may be a recent pt transfer into mt without merging with other mt loci. In general this is a complex problem, because the opposite situation also occurs, where blast hits really are more important than depths for classifying contig type. A possible future solution for GetOrganelle would be an integrated likelihood framework that weighs blast hits and depths together, rather than the multi-step framework currently implemented. @wbyu Let's leave this issue open until it can be solved without parameter fine-tuning. For now, a simple solution would be using
Please let me know if it makes sense.
|
Any updates? |
Hi Dr Jin, We have successfully extracted 19/24 true plastomes after using the suggested script and adjusting the depth-factor parameter within the range 3 to 7. I attach the related files for one failed sample, both before and after the adjustment. Thanks. |
For this failed sample, tightening the depth factor to a value as strict as 1.5 would work. I guess you can do similar parameter fine-tuning for the remaining 5 samples. Please keep me updated. |
All samples were extracted successfully. Thank you so much! |
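For anyone who needs to screen many samples for the same symptom, here is a minimal Python sketch, not part of GetOrganelle. It assumes the FASTG headers follow the SPAdes-style pattern `EDGE_<name>_length_<len>_cov_<depth>`, and it uses a length-weighted median depth as the baseline, which is my own simplification rather than how GetOrganelle classifies contigs. The function names, the `min_ratio` threshold, and the contig lengths in the demo are hypothetical; only the two depths (9.46x and 158.3x) come from this thread. It lists low-depth contigs present in the graph and does not tell you which contigs ended up in the final path.

```python
import re

# Assumption: SPAdes-style FASTG headers such as
# >EDGE_61416_length_500_cov_9.46:EDGE_1_length_80000_cov_80.0';
HEADER_RE = re.compile(
    r"EDGE_(?P<name>[^_]+)_length_(?P<length>\d+)_cov_(?P<cov>\d+(?:\.\d+)?)"
)


def read_contig_depths(fastg_lines):
    """Return {contig_name: (length, depth)} parsed from FASTG header lines."""
    depths = {}
    for line in fastg_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        # The part before ':' names the contig itself; the rest lists neighbours.
        own = line[1:].split(":", 1)[0]
        match = HEADER_RE.match(own)
        if match:
            # Forward and reverse-complement records share a name and collapse here.
            depths[match["name"]] = (int(match["length"]), float(match["cov"]))
    return depths


def weighted_median_depth(depths):
    """Length-weighted median depth, used here as a rough baseline."""
    items = sorted(depths.values(), key=lambda length_cov: length_cov[1])
    half = sum(length for length, _ in items) / 2
    running = 0
    for length, cov in items:
        running += length
        if running >= half:
            return cov
    raise ValueError("no contigs found")


def flag_low_depth(depths, min_ratio=0.2):
    """Names of contigs whose depth is below min_ratio times the baseline."""
    baseline = weighted_median_depth(depths)
    return sorted(
        name for name, (_, cov) in depths.items() if cov < baseline * min_ratio
    )


if __name__ == "__main__":
    # Depths 9.46 and 158.3 are from the thread; lengths and EDGE_1 are made up.
    demo = [
        ">EDGE_61416_length_500_cov_9.46:EDGE_1_length_80000_cov_80.0;",
        "ACGT",
        ">EDGE_61026_length_500_cov_158.3;",
        "ACGT",
        ">EDGE_1_length_80000_cov_80.0;",
        "ACGT",
    ]
    print(flag_low_depth(read_contig_depths(demo)))
```

To run it on a real graph, pass an open file handle instead of the demo list, for example `read_contig_depths(open(path))`, where `path` points to your own FASTG file. Any contig it reports still needs to be checked by hand against the final assembly.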
Hi Dr Jin,
I have already assembled my data into a circular plastome, but when I mapped reads to the sequence I found a segment in the IR regions with much lower coverage than the other regions.
After checking the mt.fastg file, I found that GetOrganelle chose #61416 (coverage 9.46x) rather than #61026 (coverage 158.3x) when it disentangled parallel contigs.
I also found this situation in the same region in some other samples. Sometimes two samples of the same species give two different disentangling results: one as described above, the other with uniform coverage. Although I have found a way to edit the mt.fastg file manually, I have plenty of plastomes that need checking.
I'm wondering why this happens and whether there is a quick way to fix it. Could you kindly give any suggestions?
Thank you.
Best,
Zhi Yang
zhiyang@njfu.edu.cn
I attached the related file here
1477.zip