The original PAN-GWES repo is located here. This duplicate repo provides a quick and easy interface for installation and testing.
conda create -n pangwes
conda activate pangwes
mamba install -c bioconda -c conda-forge pangwes
If you installed conda rather than mamba, run the last command with conda in place of mamba.
If you are on ARM64 (Macs with M-series chips), follow these steps to switch the environment's platform (subdir) to osx-64:
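The choice between the two installers can be made automatically. The following is a minimal sketch, assuming only that at least one of mamba or conda is on your PATH; the helper name `pick_installer` is hypothetical and not part of PAN-GWES. It only prints the command it would run.

```shell
# Hypothetical helper: prefer mamba when it is on PATH, otherwise fall back to conda.
pick_installer() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    else
        echo conda
    fi
}

installer="$(pick_installer)"
# Print the install command instead of running it, so nothing is changed by accident.
echo "Would run: $installer install -c bioconda -c conda-forge pangwes"
```

Replace the final echo with the command itself once the output looks right.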
conda create -n pangwes
conda activate pangwes
conda config --env --set subdir osx-64
conda install -c bioconda -c conda-forge pangwes
After this change, it is recommended to install with the conda command rather than mamba.
- If you haven’t already, install conda first.
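To decide whether the osx-64 steps apply to your machine, you can inspect `uname`. This is a sketch under stated assumptions: the function name `needs_osx64_subdir` is hypothetical, and the check relies on macOS reporting `Darwin` and `arm64` on M-series hardware (a shell running under Rosetta may report `x86_64` instead). It prints advice only and does not touch any conda configuration.

```shell
# Hypothetical helper: succeed only for the macOS + arm64 combination.
# Arguments: $1 = kernel name (uname -s), $2 = machine type (uname -m).
needs_osx64_subdir() {
    [ "$1" = "Darwin" ] && [ "$2" = "arm64" ]
}

if needs_osx64_subdir "$(uname -s)" "$(uname -m)"; then
    echo "Apple Silicon detected: set subdir to osx-64 before installing pangwes"
else
    echo "No subdir change needed on this machine"
fi
```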
- Next, install conda build tools:
conda activate base
conda install conda-build
- Clone the repo
git clone https://github.com/Sudaraka88/PAN-GWES
Alternatively, you can click the <>Code button at the top right and click Download ZIP. Afterwards, unzip the repo.
- Build the repo
cd PAN-GWES
conda build -c bioconda -c conda-forge sw
- Create a new environment and install the package
conda create -n pangwes -c bioconda -c conda-forge pangwes --use-local
When prompted, enter y to confirm the installation of pangwes and dependencies
SpydrPick and Cuttlefish are available via bioconda as osx-64 dependencies. You can create a conda environment that uses the osx-64 channel. Follow steps 1-3 first, then:
- Create a new conda environment and change channels
conda create -n pangwes
conda activate pangwes
conda config --env --set subdir osx-64
- Build the package
cd PAN-GWES
conda build -c bioconda -c conda-forge sw
- Install the package
conda install -c bioconda -c conda-forge pangwes --use-local
Please refer to the original repo
There is a sample dataset available here. This compressed file contains a single folder called efcls_assemblies, which contains 337 E. faecalis assemblies. Uncompress the downloaded file and move the efcls_assemblies folder into your working directory.
- Open a terminal in your working directory and populate a list of these assemblies using the following command.
ls -d efcls_assemblies/* > efcls_assemblies.txt
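Before building the graph, it can help to confirm that the list contains the expected 337 entries. The check below is a minimal sketch; `count_entries` is a hypothetical helper, not part of PAN-GWES, and it simply counts non-empty lines.

```shell
# Hypothetical helper: count non-empty lines in a list file.
# "|| true" keeps the function from failing when the count is 0.
count_entries() {
    grep -c . "$1" || true
}

list="efcls_assemblies.txt"
expected=337   # number of assemblies stated for the sample dataset

if [ -f "$list" ]; then
    found="$(count_entries "$list")"
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $list has $found entries"
    else
        echo "Warning: $list has $found entries, expected $expected" >&2
    fi
else
    echo "$list not found; run the ls command above first" >&2
fi
```

A mismatch usually means extra files ended up in efcls_assemblies or the archive was only partially extracted.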
- Activate the pangwes conda environment
conda activate pangwes
- Build the pangenome graph using cuttlefish
cuttlefish build --list efcls_assemblies.txt --kmer-len 61 --output efcls --threads 16 -f 1
You can provide an optional work directory to store temporary files using --work-dir and adjust the number of threads depending on your system resources. If you run into the error “Cannot open temporary file ./kmc_01021.bin”, run
ulimit -n 2048
in the terminal. See here and here.
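The open-file limit can also be checked first and raised only when needed. This is a sketch, assuming a POSIX-style shell whose `ulimit -n` prints either a number or `unlimited`; the helper name `limit_too_low` is hypothetical. The change applies to the current shell session only.

```shell
# Hypothetical helper: succeed when the current limit is numeric and below the target.
# Arguments: $1 = current limit, $2 = required limit.
limit_too_low() {
    cur="$1"
    req="$2"
    [ "$cur" != "unlimited" ] && [ "$cur" -lt "$req" ]
}

required=2048
if limit_too_low "$(ulimit -n)" "$required"; then
    # Raising the soft limit fails if the hard limit is lower than the target.
    ulimit -n "$required" 2>/dev/null || echo "Could not raise the limit to $required" >&2
fi
echo "Open-file limit for this shell: $(ulimit -n)"
```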
- Parse the built gfa1 file
gfa1_parser efcls.gfa1 efcls
Warning! The next couple of steps will take a bit of time. Remember to adjust the number of threads depending on your system resources.
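One way to pick a thread count is to query the number of online processors. The sketch below assumes `getconf _NPROCESSORS_ONLN` is available (it is on common Linux and macOS systems) and falls back to 1 otherwise; `detect_threads` is a hypothetical helper name.

```shell
# Hypothetical helper: report the number of online processors, or 1 if unknown.
detect_threads() {
    n="$(getconf _NPROCESSORS_ONLN 2>/dev/null || true)"
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric output: fall back to one thread
    esac
    echo "$n"
}

threads="$(detect_threads)"
echo "Using --threads $threads"
```

The value can then be passed as `--threads "$threads"` in place of the fixed 16 used in this walkthrough. On shared machines you may prefer a smaller number.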
- Run SpydrPick on the unitig fasta alignment
SpydrPick --alignmentfile efcls.fasta --maf-threshold 0.05 --mi-values 50000000 --threads 16 --verbose
- Calculate unitig distances
unitig_distance --unitigs-file efcls.unitigs --edges-file efcls.edges --k-mer-length 61 --sgg-paths-file efcls.paths --queries-file efcls.*spydrpick_couplings*edges --threads 16 --queries-one-based --run-sggs-only --output-stem efcls --verbose
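The `--queries-file` argument above uses a shell glob, so it behaves as intended only when exactly one file matches. If earlier SpydrPick runs left several outputs in the directory, the glob expands to multiple arguments. The following sketch checks for a single match before running the command; `single_match` is a hypothetical helper, not part of PAN-GWES.

```shell
# Hypothetical helper: print the path when the caller's glob matched exactly one
# existing file; fail when there were zero or several matches.
single_match() {
    if [ "$#" -eq 1 ] && [ -e "$1" ]; then
        printf '%s\n' "$1"
    else
        return 1
    fi
}

if queries="$(single_match efcls.*spydrpick_couplings*edges)"; then
    echo "Queries file: $queries"
else
    echo "Expected exactly one SpydrPick couplings file in $(pwd)" >&2
fi
```

If the check fails, remove or move stale outputs, or pass the intended file name explicitly.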
- Generate the pangenome Manhattan Plot
./gwes_plot.r -i efcls.ud_sgg_0_based -n 337
Warning! This script will try to install the required dependencies. Use --no-deps to generate the plot without installing any packages (slower!).
If all went to plan, this example should generate the following figure: