git clone --recursive https://github.com/Cibiv/BODscore
cd BODscore
bash ./install.sh
Re-evaluates the quality of called SNPs (provided in VCF format). Returns:
- the reliability of a SNP (BOD score) based on the described geometric method
- per-nucleotide coverage of the reference around SNPs, returned either as:
  - a tab-delimited file (coverage for the range of `+/- floor(1.5 * read length)` around the SNP, with blocks delimited by `\t|\t`), or
  - an SQLite3 database with coverage written as blob columns (a faster option). The database can be read with the provided Python `ReadCoverage.py` class and related scripts
- the following coverage data is returned within `QdrArray` objects:
  - accounting for perfectly matching reads with 100% identity (`HI`) and loosely matching reads, between 90% and 100% identity (`LO`)
  - for forward and reverse strands
- three `QdrArray` objects with coverage profiles are returned:
  - for all reads in the range (`totCov` column in the SQLite database)
  - for reads covering only the locus proper (`snpCov` column)
  - location of centres of the reads covering the SNP locus (`alnCtr` column)
- the coverage data is written in a `sampleX__coverage` table with a composite key of `(contig, pos)`, where `contig` is of `TEXT` type and `pos` (position) is of `INT` type.
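
The window size and the block delimiter of the tab-delimited output can be illustrated with a short Python sketch. It is not part of BODscore: the helper names `coverage_window` and `split_blocks`, the inclusive coordinate convention, and the contents of the example line are all assumptions made for illustration. Only the `floor(1.5 * read length)` half-width and the `\t|\t` block delimiter come from the description above.

```python
import math


def coverage_window(snp_pos, read_length):
    """Return (start, end) of the +/- floor(1.5 * read length) window.

    ASSUMPTION: both ends are treated as inclusive and no clipping to the
    contig boundaries is done; the real tool may handle this differently.
    """
    half_width = math.floor(1.5 * read_length)
    return snp_pos - half_width, snp_pos + half_width


def split_blocks(line):
    """Split a line into blocks on the tab-pipe-tab delimiter,
    then split every block into its tab-separated fields."""
    blocks = line.rstrip("\n").split("\t|\t")
    return [block.split("\t") for block in blocks]


# 101 bp reads give a half-width of floor(151.5) = 151 positions.
window = coverage_window(snp_pos=1000, read_length=101)

# HYPOTHETICAL line: the meaning of each block is not taken from the tool.
example_line = "chr1\t1000\t|\t3\t4\t5\t|\t1\t2\t2\n"
blocks = split_blocks(example_line)

print(window)
print(blocks)
```

Splitting on the full `\t|\t` string first, and on single tabs afterwards, keeps the fields inside each block together.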
Test sample files for running the program are in the folder `testcase/`.
See the bash scripts `testcase/db_vshape_test.sh` and `testcase/tab_vshape_test.sh` for examples of `vshape` invocation.
For sqlite3 -> csv conversion, see `testcase/dbtocsv.sh`.
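
For readers who prefer Python, the same kind of export could be sketched roughly as follows. This is not the content of `testcase/dbtocsv.sh`. The table name, the `(contig, pos)` key and the `totCov`, `snpCov` and `alnCtr` column names follow the description above, but the blob encoding used here (a flat array of native C ints) is purely an assumption so that the example runs on its own. Real blobs written by the tool should be decoded with the provided `ReadCoverage.py`.

```python
import csv
import io
import sqlite3
from array import array


def pack(values):
    # ASSUMPTION: a blob is a flat array of native C ints.
    # The encoding actually used by the tool is not documented here.
    return array("i", values).tobytes()


def unpack(blob):
    # Inverse of pack(): bytes back to a list of Python ints.
    out = array("i")
    out.frombytes(blob)
    return out.tolist()


# Build a toy in-memory database with the layout described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE sampleX__coverage (
           contig TEXT,
           pos    INT,
           totCov BLOB,
           snpCov BLOB,
           alnCtr BLOB,
           PRIMARY KEY (contig, pos)
       )"""
)
conn.execute(
    "INSERT INTO sampleX__coverage VALUES (?, ?, ?, ?, ?)",
    ("chr1", 1000, pack([3, 4, 5]), pack([1, 2, 2]), pack([0, 1, 0])),
)


def export_csv(connection, table, out):
    """Write one CSV row per (contig, pos); blobs become space-separated ints."""
    writer = csv.writer(out)
    writer.writerow(["contig", "pos", "totCov", "snpCov", "alnCtr"])
    query = (
        f'SELECT contig, pos, totCov, snpCov, alnCtr FROM "{table}" '
        "ORDER BY contig, pos"
    )
    for contig, pos, *blobs in connection.execute(query):
        cells = [" ".join(str(v) for v in unpack(b)) for b in blobs]
        writer.writerow([contig, pos] + cells)


buffer = io.StringIO()
export_csv(conn, "sampleX__coverage", buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Blob columns cannot be written to CSV directly, so any export has to decode them into text first; only that decoding step depends on the actual storage format.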
A Python script (`src/plot_coverage_sqlite.py`) is provided to plot the coverage.
An earlier R script (`src/rscript.R`) is outdated and is not compatible with the current version.
If you have any questions or concerns, please send an email to:
fritz.sedlazeck@gmail.com
d.lituiev@gmail.com