This git repository contains the workflows used in the study "Boosting Felsenstein Phylogenetic Bootstrap".
They are implemented in Nextflow.

Original data is located in the data folder:
- mammals: Raw alignment and annotation file
- ncbitax: NCBI taxonomy
- vih: Raw HIV pol alignment

Each sub-folder corresponds to one analysis:
- hiv_pol: analysis of 9147 sequences of HIV pol;
- mammals_COI5P: analysis of 1449 sequences of the COI-5P protein in mammals;
- mammals_simulated: analysis of simulated data;
- bootstrap_simulated_bootstrap: comparison of bootstrap samples with samples simulated from the true tree;
- transfer_distance: analysis of transfer distance as a function of branch depth and number of taxa.

All data described in the paper can be downloaded here.
Each folder contains a run.sh script to launch the analyses.
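
As a minimal sketch, the analyses could be launched one after another with a small helper like the one below. The function name run_all is hypothetical and not part of the repository. It assumes each run.sh can be started with bash from inside its own folder without arguments, which may not hold for every analysis, so check each script before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): call the run.sh found in
# each analysis sub-folder, one after another.
# Assumption: every run.sh runs from inside its own folder and takes no arguments.

run_all() {
    # Root of the repository; defaults to the current directory.
    local root="${1:-.}"
    local dir
    for dir in hiv_pol mammals_COI5P mammals_simulated \
               bootstrap_simulated_bootstrap transfer_distance; do
        if [ -f "$root/$dir/run.sh" ]; then
            echo "Running $dir"
            # Subshell, so the cd does not affect the next iteration.
            ( cd "$root/$dir" && bash run.sh ) || echo "FAILED: $dir" >&2
        else
            echo "Skipping $dir (no run.sh found)" >&2
        fi
    done
}
```

After sourcing the snippet, `run_all /path/to/repository` walks through the five analysis folders. A failing analysis is reported on standard error and the loop moves on to the next one.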
To run all the pipelines, the following dependencies are necessary: