Questions about using GetOrganelle to assemble/bait Angiosperms353 loci sequences #309
-
I am pleased to learn that GetOrganelle has benefited your research, and I appreciate you bringing this issue to our attention. While there have been inquiries about assembling custom loci using GetOrganelle, and I am aware of the Corydalis study you mentioned, I must admit that I had not previously looked closely at their methodology. It is indeed intriguing that their approach improves downstream analysis compared to Easy353, and their rationale appears sound. However, I have not yet incorporated 353 data into my own research.
The current version of GetOrganelle is not specifically optimized for this purpose, so there is no straightforward one-command solution. Nonetheless, since you have already created a customized label database and used it for assembly, the remaining steps should be manageable. Here is what you should do next:
Examine the accompanying *.csv file (which is actually tab-separated) produced by GetOrganelle. This file contains all the gene marker labels for the target contigs. Downstream, you will need to write and run a custom script that
1) parses the tab-separated *.csv file and builds a map of contig-to-gene associations, and
2) uses this map to extract the relevant contigs from the current FASTA output, storing them as individual gene.fasta files.
Should you encounter any difficulties, please do not hesitate to contact me. If you need assistance with scripting, feel free to send me one of your current outputs (*.csv and *.fasta), and I may be able to help craft a Python script.
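The two steps above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not part of GetOrganelle: the column names `contig` and `label`, the comma-separated label format, and the idea that contig names in the table match the first word of each FASTA header are all hypothetical and must be checked against your actual *.csv and *.fasta files.

```python
import csv
import io
from collections import defaultdict
from pathlib import Path


def read_contig_to_genes(tab_text, contig_col="contig", label_col="label"):
    """Step 1: build a contig -> [gene, ...] map from the tab-separated table.

    ASSUMPTION: the column names and the comma-separated label format are
    hypothetical; adjust them to the header of your own *.csv file.
    """
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    contig_to_genes = {}
    for row in reader:
        raw = row.get(label_col) or ""
        genes = [g.strip() for g in raw.split(",") if g.strip()]
        if genes:  # skip contigs that carry no gene label
            contig_to_genes[row[contig_col]] = genes
    return contig_to_genes


def read_fasta(fasta_text):
    """Parse FASTA text into {record_id: sequence}; the id is the first word of the header."""
    chunks = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {n: "".join(parts) for n, parts in chunks.items()}


def group_by_gene(contig_to_genes, sequences):
    """Step 2a: collect (contig, sequence) pairs per gene.

    ASSUMPTION: contig names in the table match FASTA ids exactly;
    contigs without a matching FASTA record are silently skipped.
    """
    per_gene = defaultdict(list)
    for contig, genes in contig_to_genes.items():
        if contig not in sequences:
            continue
        for gene in genes:
            per_gene[gene].append((contig, sequences[contig]))
    return dict(per_gene)


def write_gene_fastas(per_gene, out_dir):
    """Step 2b: write one <gene>.fasta file per gene into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for gene, records in per_gene.items():
        with open(out_dir / f"{gene}.fasta", "w") as handle:
            for contig, seq in records:
                handle.write(f">{contig}\n{seq}\n")
```

To use it, read the two files as text (for example with `Path(...).read_text()`), pass them through `read_contig_to_genes` and `read_fasta`, and hand the result of `group_by_gene` to `write_gene_fastas`. A contig labeled with several genes ends up in several gene files, which makes ambiguous associations easy to spot afterwards.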
-
Hi Dr. @JianjunJin, I'm not sure whether I should open a separate discussion, but since the following question is still related to "using GetOrganelle to assemble Angiosperms353 loci sequences", and to make it easier for others who run into the same problem to find it, I've decided to edit the title of this discussion instead. Please let me know if I should split this comment off. After playing around with my data, I found that two "edges" have more than one BLAST hit (as seen in the attached
Eventually, I chose to test the third option and found that the number of locus sets included in the seed file (e.g., if I take Angiosperms353 loci from 2 samples as seed, there will be 2 sets of loci) affects the resulting number of scaffolds, as well as the presence of "chimeric" scaffolds in the
For the seed & label files that included "many" sets of loci, I've inputted the
I'm not quite sure how to interpret this result or whether I can use these assembled sequences in further analyses, since I'm afraid I might be missing something (e.g., assembly errors, wrong associations between scaffolds and loci). Therefore, any suggestions are appreciated! Thank you, Dr. Jin. If any other files or records are needed, please let me know.
K01-353extended_K77.assembly_graph.fastg.extend-label.csv
-
First of all, thank you very much for this really convenient tool! I have successfully assembled plastome sequences and nrDNA cistron sequences for my research. :D
I learned from Chen et al. (2023) that GetOrganelle can be used to assemble Angiosperms353 loci sequences, and that the assembled sequences yield better downstream analysis results than those from Easy353. However, since assembling one locus at a time is too time- and resource-consuming (even when GNU parallel is used), I generated my own label database and am now able to assemble all loci in a single run.
But now I face another issue: the sequences in *.scaffolds.graph1.1.path_sequences.fasta do not have headers containing loci names, and manually BLASTing these sequences or identifying them via Bandage is laborious as well. Therefore, could you kindly suggest a convenient/lazy way to do this? I'm really sorry if I overlooked something in the Wiki and Discussions and ended up asking ignorant questions. Anyway, thank you for your time and patience regarding this matter. Looking forward to hearing from you! ^^