eKLIPse is no longer maintained.
An unpublished version (v2.1) with duplication integration
is available in this repository.
Duplications are defined using the MitoSAlt approach (Basu et al. 2020).
Outputs were also improved.
eKLIPse is a sensitive and specific tool allowing the detection and quantification of large mtDNA rearrangements.
Based on soft-clipping, it provides the precise breakpoint positions and the cumulative percentage of mtDNA rearrangements at a given gene location with a high detection sensitivity.
Both single and paired-end (mtDNA, WES, WGS) data are accepted.
eKLIPse requires two types of input: the BAM or SAM alignment files (with header) and the corresponding mitochondrial genome (GenBank format).
Alignments must contain soft-clipping information (see your aligner options).
eKLIPse is available either as a script to be integrated in a pipeline, or as a user-friendly graphical interface.
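The two input requirements above (a header and soft-clipped reads) can be checked before launching an analysis. The following is a minimal sketch in Python 3, not part of eKLIPse: `sam_summary` is a hypothetical helper that scans SAM text lines, treats lines starting with `@` as header lines, and counts reads whose CIGAR string (6th column) contains an `S` operation. The example records are made up for illustration.

```python
import re

def sam_summary(lines):
    """Return (has_header, n_reads, n_soft_clipped) for an iterable of SAM text lines."""
    has_header = False
    n_reads = 0
    n_soft = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header lines (@HD, @SQ, ...) start with '@'
            has_header = True
            continue
        fields = line.split("\t")
        if len(fields) < 11:
            # A SAM alignment record has 11 mandatory fields
            continue
        n_reads += 1
        # Column 6 is the CIGAR string; 'S' marks soft-clipped bases
        if re.search(r"\d+S", fields[5]):
            n_soft += 1
    return has_header, n_reads, n_soft

# Made-up records: one soft-clipped read (70M30S) and one fully aligned read (100M)
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chrM\tLN:16569",
    "r1\t0\tchrM\t100\t60\t70M30S\t*\t0\t0\t" + "A" * 100 + "\t" + "I" * 100,
    "r2\t0\tchrM\t200\t60\t100M\t*\t0\t0\t" + "C" * 100 + "\t" + "I" * 100,
]
summary = sam_summary(example)
print(summary)
```

For a BAM file, the same lines could be obtained by piping the output of `samtools view -h` into such a function. If no read carries an `S` operation, the aligner was probably run without soft-clipping.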
- Like other CNV tools, eKLIPse performance depends on your sequencing and mapping steps.
- download the latest version (080620) here.
- unzip the ZIP file.
- launch 'eKLIPse.exe'
- Spaces are not allowed in the executable and input/output paths
- install required tools (see Requirements section)
- download the latest version here.
- unzip Qt_eKLIPse_unix_v1-0.zip
- cd Qt_eKLIPse_unix_v1-0
- chmod a+x eKLIPse
- ./eKLIPse
To start analysis, simply click "START".
(you can change the colors by clicking on the colors at the bottom right)
1 - To select your alignment files, click "ADD". If required, you can change an alignment title by selecting the corresponding cell.
2 - Select your reference genome. If you choose "Other", browse to your own GenBank file by clicking on the folder icon.
3 - To change "results directory", click on the folder icon.
4 - To modify "Advanced parameters", click on the expand icon. Please refer to the "Parameters" section for further information.
5 - Launch analysis by clicking "START"
The detailed progress of the eKLIPse analysis can be followed in this window.
Once the analysis is complete, the program automatically opens the result folder.
Two reduced alignment files are provided with the archive file.
Click "TEST" on the "Launch Analysis" window before clicking "START".
A docker image is also available. Follow the build instructions here.
Please install the following modules & tools:
- python 2.7
- biopython
- tqdm
- samtools
- blastn & makeblastdb (>=2.3.0+)
- circos
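Before running eKLIPse, it can be useful to confirm that the external tools listed above are reachable through `$PATH`. The following is a minimal sketch in Python 3 (separate from eKLIPse itself, which requires Python 2.7); `find_missing_tools` is a hypothetical helper name, and the lookup function is injectable so the example can be exercised without the tools installed.

```python
import shutil

# External executables named in the requirements list above
REQUIRED_TOOLS = ["samtools", "blastn", "makeblastdb", "circos"]

def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be located on the current PATH."""
    return [tool for tool in tools if which(tool) is None]

missing = find_missing_tools()
if missing:
    print("Not found in PATH: " + ", ".join(missing))
else:
    print("All required tools found in PATH")
```

Any tool reported as missing can still be used by passing its location with the matching "-samtools", "-blastn", "-makeblastdb" or "-circos" option.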
python eKLIPse.py --test
(*add "-samtools", "-blastn", "-makeblastdb" and "-circos" options if not in $PATH)
python eKLIPse.py -in <INPUT file path> -ref <GBK file path> [OPTIONS]
[OPTIONS]
-out <str> : Output directory path [current]
-tmp <str> : Temporary directory path [/tmp]
-scsize <int> : Soft-clipping minimal length [25]
-mapsize <int> : Upstream mapping length [20]
-downcov <int> : Downsampling read number [500000] (0=disable)
-minq <int> : Read quality threshold [20]
-minlen <int> : Read length threshold [100]
-shift <int> : Breakpoint sliding-window size [5]
-minblast <int> : Minimal number of BLAST per breakpoint [1]
-bilateral <bool> : Filter unidirectional BLAST [True]
-mitosize <int> : Remove deleted mtDNA less than [1000]
-id <int> : BLAST %identity threshold [80]
-cov <int> : BLAST %coverage threshold [70]
-gapopen <int> : BLAST cost to open a gap [0:proton, 5:illumina]
-gapext <int> : BLAST cost to extend a gap [2]
-thread <int> : Thread number [2]
-samtools <str> : samtools bin path [$PATH]
-blastn <str> : BLASTN bin path [$PATH]
-makeblastdb <str> : makeblastdb bin path [$PATH]
-circos <str> : circos bin path [$PATH]
--test : eKLIPse test
--nocolor : Disable output colors
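For pipeline integration, the command line above can be assembled programmatically. The following is a rough sketch in Python 3: `build_eklipse_command` is a hypothetical helper, not part of eKLIPse, that only uses flags from the option list above. The file names (`samples.tsv`, `results`) are placeholders.

```python
def build_eklipse_command(input_table, reference_gbk, out_dir=None, options=None):
    """Build the eKLIPse argument list from the documented flags.

    `options` maps a flag name without its leading dash (e.g. "downcov")
    to its value; flags are emitted in sorted order for reproducibility.
    """
    cmd = ["python", "eKLIPse.py", "-in", input_table, "-ref", reference_gbk]
    if out_dir is not None:
        cmd += ["-out", out_dir]
    for flag, value in sorted((options or {}).items()):
        cmd += ["-" + flag, str(value)]
    return cmd

cmd = build_eklipse_command(
    "samples.tsv",              # placeholder input table
    "data/NC_012920.1.gb",      # rCRS reference provided with eKLIPse
    out_dir="results",          # placeholder output directory
    options={"downcov": 0, "thread": 4},
)
print(" ".join(cmd))
# The list could then be executed with, for example, subprocess.call(cmd)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths are supplied by an upstream pipeline step.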
eKLIPse accepts alignments in BAM or SAM format (header required) for both single and paired-end sequencing data.
The input file is a simple tabulated text file as follows:
path_bam | title1 |
path_bam2 | title2 |
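Such an input table can be generated from a sample list. The sketch below (Python 3) assumes the layout shown above, one alignment per line with the path and the title separated by a tab; `write_input_table` and the sample paths are hypothetical.

```python
import csv
import io

def write_input_table(handle, samples):
    """Write one 'path<TAB>title' line per alignment file."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for path, title in samples:
        writer.writerow([path, title])

# Hypothetical sample list: (alignment path, title)
samples = [
    ("/data/sample1.bam", "sample1"),
    ("/data/sample2.bam", "sample2"),
]

buffer = io.StringIO()
write_input_table(buffer, samples)
table_text = buffer.getvalue()
print(table_text, end="")
```

To write to disk instead, pass a file opened with `open("samples.tsv", "w", newline="")` in place of the `StringIO` buffer.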
eKLIPse accepts any mtDNA reference genome in GenBank format.
rCRS (NC_012920.1.gb), CRS (J01415.2.gb) and Mus musculus (NC_005089.1.gb) are provided in "/data"
In order to reduce the execution time, a downsampling option is available.
For single deletions with low mutant load or multiple deletions, we advise against downsampling ("-downcov 0").
The resulting read number should provide sufficient mitochondrial genome coverage.
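To judge whether a given read number gives sufficient coverage, the mean depth can be approximated as (number of reads × read length) / genome length. The sketch below (Python 3) applies this to the default downsampling value of 500000 reads and the 16569 bp rCRS genome. It assumes that every retained read maps to the mitochondrial genome, and the 100 bp read length is an illustrative value; `mean_coverage` is a hypothetical helper.

```python
RCRS_LENGTH = 16569  # length of the rCRS reference (NC_012920.1) in bp

def mean_coverage(n_reads, read_length, genome_length=RCRS_LENGTH):
    """Approximate mean depth, assuming all reads map to the genome."""
    return n_reads * read_length / float(genome_length)

# Default -downcov value with illustrative 100 bp reads
depth = mean_coverage(500000, 100)
print("Approximate mean depth: %.0fx" % depth)
```

This is only an upper bound for WES or WGS data, where mitochondrial reads are a fraction of the total.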
According to your sequencing technology and library, you can adjust the minimum read length value (-minlen).
You can adjust the minimum read quality (-minq), for example lowering it to keep reads with multiple hits, whose quality is reduced.
For short read data, we advise reducing the minimal soft-clipping length (-scsize) and the upstream mapping length (-mapsize).
For example, with 100bp reads, you could use "-scsize 15" and "-mapsize 10".
The breakpoint sliding-window size (-shift) can be modified if you expect a high number of homopolymers.
BLASTn thresholds are mostly sequencing technology dependent.
Depending on your sequencing quality, you can increase or decrease the identity and coverage thresholds (-id / -cov).
Illumina is known to generate fewer errors, so gap thresholds (-gapopen / -gapext) can be more stringent.
For example, with Illumina reads, you could use "-gapopen 5" and "-gapext 2".
Depending on your sequencing depth, quality and required stringency, you can modify the filters.
Increasing the minimum number of BLAST hits per breakpoint (-minblast) increases specificity but decreases sensitivity.
By default, eKLIPse filters out deleted mtDNA with a length under 1000 bp (-mitosize).
For example, if you are looking for sublimons, you could reduce this length to 100 bp.
eKLIPse is based on the search for bidirectional BLAST hits linking 5' and 3' breakpoints.
It is therefore not recommended to disable this filter ("-bilateral False").
File containing all predicted deletions (bkp=breakpoint).
Title | 5'bkp | 3'bkp | Freq | Freq for | Freq rev | 5' Blast | 3' Blast | 5' Depth | 3' Depth | Repetition |
file1 | 7753 | 14601 | 3,46 | 0,38 | 6,55 | 2 | 23 | 1393 | 412 | 7754-GA-7755 | 14601-GA-14602 |
file2 | 7981 | 14955 | 7,40 | 4,28 | 10,51 | 2408 | 2506 | 7080 | 2544 | 7982-CT-7983 | 14955-CT-14956 |
file3 | 460 | 5243 | 7,24 | 13,72 | 0,76 | 7 | 1 | 72 | 197 | 458-CT-459 | 5242-CT-5243 |
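The frequencies in this excerpt use a decimal comma, which has to be converted before any numeric post-processing. The sketch below (Python 3) parses one row laid out as in the excerpt above. The `|` separator is an assumption taken from this excerpt and may differ in the actual output file; `parse_decimal` and `parse_deletion_row` are hypothetical helpers.

```python
def parse_decimal(text):
    """Convert a decimal-comma string such as '3,46' to a float."""
    return float(text.strip().replace(",", "."))

def parse_deletion_row(row, sep="|"):
    """Parse the leading columns of one deletion row (separator is assumed)."""
    cells = [cell.strip() for cell in row.split(sep) if cell.strip()]
    record = {
        "title": cells[0],
        "bkp5": int(cells[1]),           # 5' breakpoint position
        "bkp3": int(cells[2]),           # 3' breakpoint position
        "freq": parse_decimal(cells[3]),
        "freq_for": parse_decimal(cells[4]),
        "freq_rev": parse_decimal(cells[5]),
    }
    # Distance between the two breakpoint coordinates
    record["span"] = record["bkp3"] - record["bkp5"]
    return record

row = "file1 | 7753 | 14601 | 3,46 | 0,38 | 6,55 | 2 | 23 | 1393 | 412 | 7754-GA-7755 | 14601-GA-14602 |"
record = parse_deletion_row(row)
print(record["title"], record["span"], record["freq"])
```

Once converted, rows can be filtered or sorted numerically, for instance to keep only deletions above a chosen frequency.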
File summarizing cumulated deletions per mtDNA gene.
Gene | Start | End | Type | file3 | file4 | file5 |
MT-TF | 577 | 647 | trna | 0,38 | 0,82 | 14,03 |
MT-RNR1 | 648 | 1601 | rrna | 2,27 | 14,42 | 14,03 |
MT-TV | 1602 | 1670 | trna | 2,27 | 14,42 | 14,03 |
MT-RNR2 | 1671 | 3229 | rrna | 2,27 | 14,78 | 14,03 |
MT-TL1 | 3230 | 3304 | trna | 2,27 | 14,78 | 14,03 |
MT-ND1 | 3307 | 4262 | protein | 2,27 | 15,05 | 14,03 |
One plot is created per input alignment. An example is shown below.
eKLIPse is available under the GNU Affero General Public License v3.0.
Please cite (submitted article)
eKLIPse: A sensitive tool for the detection and quantification of mitochondrial DNA deletions from next generation sequencing data.