This repository is the current, maintained version of Recursive PFP from https://github.com/marco-oliva/r-pfbwt. That repository is now deprecated. Any issues should be posted here.
rPFP is a tool to build the run-length encoded BWT and the SA values at the run heads from the
prefix-free parsing of the input data.
rPFP is available on Docker:
docker pull moliva3/r-pfbwt:latest
docker run moliva3/r-pfbwt:latest rpfbwt --help
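To process files that live on the host, the container needs access to them. The following is a minimal sketch, assuming the image accepts an arbitrary command as in the example above. The `/data` mount point, the `DRY_RUN` variable and the helper names `run` and `rpfbwt_docker` are hypothetical and not part of rPFP.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
# (run and DRY_RUN are illustrative names, not part of rPFP.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Run rpfbwt inside the container with the current directory mounted at
# /data (an arbitrary mount point chosen for this sketch) and used as the
# working directory, so relative paths resolve to files on the host.
rpfbwt_docker() {
    run docker run --rm -v "$PWD":/data -w /data moliva3/r-pfbwt:latest rpfbwt "$@"
}

# Preview the resulting command without starting a container.
(DRY_RUN=1; rpfbwt_docker --help)
```

Dropping the `DRY_RUN` setting runs the command for real. Any arguments given to `rpfbwt_docker` are passed through to `rpfbwt` unchanged.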
If using Singularity:
singularity pull rpfbwt_sif docker://moliva3/r-pfbwt:latest
./rpfbwt_sif rpfbwt --help
rPFP can be built using:
git clone git@github.com:EddieFerro/rPFP.git
cd rPFP
mkdir build && cd build
cmake ..
make -j
rPFP takes as input the prefix-free parse of the input data, namely a dictionary D1 and a parse P1,
together with the dictionary D2 and the parse P2 obtained by prefix-free parsing P1. Note that rPFP
does not use P1; it only uses D1, D2 and P2. To compute the prefix-free parsing of the input data we
can use pfp++ (link here). The following example computes the run-length encoded BWT of multiple
yeast sequences.
wget "https://gitlab.com/manzai/Big-BWT/-/raw/f67022fe74dae0234e516324103613a0fdd58a6e/yeast.fasta?inline=false" -O ./yeast.fasta
pfp++ -f yeast.fasta -w 10 -p 100 --output-occurrences
pfp++ -i yeast.fasta.parse -w 5 -p 11
rpfbwt --l1-prefix yeast.fasta --w1 10 --w2 5 --threads 10
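The three steps above can be wrapped in a small script so that the window lengths passed to pfp++ and to rpfbwt always agree. This is a sketch only: the function name `rpfp_pipeline`, the `run` helper and the `DRY_RUN` variable are hypothetical, and the only file name assumed is the `.parse` suffix already used in the example.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
# (run and DRY_RUN are illustrative names, not part of rPFP or pfp++.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Arguments: input file, level 1 window, level 1 modulus,
#            level 2 window, level 2 modulus, number of threads.
rpfp_pipeline() {
    input="$1"; w1="$2"; p1="$3"; w2="$4"; p2="$5"; threads="$6"

    # Level 1: prefix-free parse of the input, producing D1 and P1.
    run pfp++ -f "$input" -w "$w1" -p "$p1" --output-occurrences || return 1

    # Level 2: prefix-free parse of P1, producing D2 and P2.
    run pfp++ -i "$input.parse" -w "$w2" -p "$p2" || return 1

    # Build the run-length encoded BWT from D1, D2 and P2,
    # reusing the same window lengths as the two parsing steps.
    run rpfbwt --l1-prefix "$input" --w1 "$w1" --w2 "$w2" --threads "$threads"
}

# Preview the commands for the yeast example without executing them.
(DRY_RUN=1; rpfp_pipeline yeast.fasta 10 100 5 11 10)
```

With `DRY_RUN` unset, the same call executes the three commands in order and stops at the first one that fails.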
The available parameters for rPFP are listed below:
rpfbwt
Usage: rpfbwt [OPTIONS]

Options:
  -h,--help                   Print this help message and exit
  --l1-prefix TEXT REQUIRED   Level 1 Prefix.
  --w1 UINT REQUIRED          Level 1 window length.
  --w2 UINT REQUIRED          Level 2 window length.
  -t,--threads UINT           Number of threads.
  --chunks UINT:INT in [1 - 1000]
                              Number of chunks.
  --integer-shift UINT        Integer shift used during parsing.
  --tmp-dir TEXT:DIR          Temporary files directory.
  --bwt-only                  Only compute the RLBWT. No SA values.
  --version                   Version number.
  --configure                 Read an ini file
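The help text states two constraints that can be checked before launching a long run: `--chunks` must lie in [1 - 1000], and `--tmp-dir` must be a directory. The sketch below mirrors those two checks in a wrapper script. The function name `check_rpfbwt_args` is hypothetical, and rpfbwt may well perform equivalent validation itself.

```shell
# Hypothetical pre-flight check for two options from the help text.
# Returns 0 when both values look valid, 1 otherwise.
check_rpfbwt_args() {
    chunks="$1"
    tmp_dir="$2"

    # --chunks must be an unsigned integer ...
    case "$chunks" in
        ''|*[!0-9]*)
            echo "chunks must be an unsigned integer: $chunks" >&2
            return 1
            ;;
    esac

    # ... in the range [1, 1000].
    if [ "$chunks" -lt 1 ] || [ "$chunks" -gt 1000 ]; then
        echo "chunks must be in [1, 1000]: $chunks" >&2
        return 1
    fi

    # --tmp-dir must be a directory.
    if [ ! -d "$tmp_dir" ]; then
        echo "temporary directory not found: $tmp_dir" >&2
        return 1
    fi

    return 0
}

# Only print the command line when both values pass the checks.
if check_rpfbwt_args 50 /tmp; then
    echo "rpfbwt --l1-prefix yeast.fasta --w1 10 --w2 5 --chunks 50 --tmp-dir /tmp"
fi
```

Replacing the final `echo` with the command itself would run rpfbwt only after both checks succeed.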