Author: Priya Lakra
This repository contains scripts for constructing a core-genome phylogeny for taxonomic purposes. The workflow extracts the core genome from genome assemblies of closely and/or distantly related species, then uses that core genome to build a phylogenomic tree.
The get_homologues package is required for extracting core and pan-genomes across different species.
MAFFT is used for generating multiple sequence alignments.
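As a rough illustration of the alignment step, the sketch below aligns every gene file in a directory with MAFFT. It is not taken from the repository's `pl_*.sh` scripts: the function name `align_core_genes`, the default `fasta` extension, and the `.aln.fasta` output suffix are all assumptions made for the example. The only MAFFT option used is `--auto`, which lets MAFFT choose an alignment strategy based on the size of the input.

```shell
# Hypothetical helper (not part of this repository): align each gene file
# in IN_DIR with MAFFT and write the result to OUT_DIR.
# Usage: align_core_genes IN_DIR OUT_DIR [EXTENSION]
align_core_genes() {
  in_dir=$1
  out_dir=$2
  ext=${3:-fasta}                        # assumed input extension; adjust to your files
  mkdir -p "$out_dir"
  for f in "$in_dir"/*."$ext"; do
    [ -e "$f" ] || continue              # skip when the glob matches nothing
    name=$(basename "$f" ."$ext")
    # --auto: MAFFT selects a strategy according to data size
    mafft --auto "$f" > "$out_dir/$name.aln.fasta"
  done
}
```

For example, `align_core_genes core_genes aligned fna` would align every `.fna` file in a directory named `core_genes` (a placeholder name).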
To concatenate the alignments into a supermatrix, you can use either of the following:
- SequenceMatrix
- Custom scripts
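A custom concatenation script can be quite small. The sketch below is one possible approach, not the logic of `pl_concat.sh`: the function name `concat_alignments` is hypothetical, and it assumes that every alignment file is in FASTA format, that each file contains the same taxon IDs (the header text up to the first space), and that sequences may be wrapped over several lines. Sequences are joined per taxon in the order the files are given, and the function fails if the resulting rows differ in length.

```shell
# Hypothetical helper (not part of this repository): concatenate aligned
# FASTA files into a single supermatrix, written to standard output.
# Usage: concat_alignments gene1.aln.fasta gene2.aln.fasta ... > supermatrix.fasta
concat_alignments() {
  awk '
    /^>/ {
      id = substr($1, 2)                          # taxon ID without the leading ">"
      if (!(id in seen)) { seen[id] = 1; order[++n] = id }
      next
    }
    {
      gsub(/[ \t\r]/, "")                         # drop whitespace and carriage returns
      seq[id] = seq[id] $0                        # append this line to the taxon row
    }
    END {
      # All rows of a supermatrix must have the same length.
      for (i = 2; i <= n; i++) {
        if (length(seq[order[i]]) != length(seq[order[1]])) {
          print "error: unequal row length for " order[i] > "/dev/stderr"
          exit 1
        }
      }
      for (i = 1; i <= n; i++) print ">" order[i] "\n" seq[order[i]]
    }
  ' "$@"
}
```

A taxon missing from one of the input files shows up as a length mismatch, so the function stops with an error rather than silently producing a shifted matrix.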
IQ-TREE and MEGA X can be used for generating phylogenomic trees.
iTOL is used for tree visualization and editing. FigTree can also be used.
`git clone https://github.com/PriyaLakr/coreGenomePhylogeny.git`
`cd PATH/TO/coreGenomePhylogeny`
`bash pl_install_depend.sh`
You can check the installation details in the file `install.info`.
`bash pl_getcg.sh [options]`
For help, run `bash pl_getcg.sh -h`
`bash pl_treegen.sh [options]`
For help, run `bash pl_treegen.sh -h`
`bash pl_concat.sh`
For help, run `bash pl_concat.sh -h`
Note: An alternative is to build a tree from each individual gene file and then combine the individual gene trees into a supertree. Read about this approach first to confirm that you really need it.
`bash pl_treegen.sh [options]`
Reference data adapted from Lakra, P. et al., 2021.