seq_typing

Determines which reference sequence is most likely to be present in a given sample

Rationale
Input requirements
Dependencies
- Install dependencies
Install seq_typing
Usage
Outputs
- seq_typing.py
- ecoli_stx_subtyping.py
Citation
Contact

Rationale

seq_typing determines the type of a given sample using either a read mapping approach or a BLAST search against a set of reference sequences.
For the read mapping approach, the sample's reads are mapped to the given reference sequences using Bowtie2, parsed with Samtools and analysed via ReMatCh. Based on the length of the sequence covered and its depth of coverage, seq_typing returns the type associated with the reference sequence that is most likely to be present. The selected sequence is the one that passes the defined thresholds and has the greatest length covered, the highest depth of coverage and the highest identity (applied hierarchically, in that order).
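The hierarchical selection described above can be sketched as a threshold filter followed by a tuple comparison. This is a minimal illustration, not seq_typing's actual code: the `ReferenceStats` type, the `select_reference` function and the default threshold values are hypothetical. The two candidates are taken from the E. coli O-type example report shown further below.

```python
from typing import NamedTuple, Optional, Sequence


class ReferenceStats(NamedTuple):
    """Per-reference mapping statistics (hypothetical container)."""
    name: str
    covered: float   # percentage of the reference sequence covered
    depth: float     # mean depth of coverage
    identity: float  # percentage of sequence identity


def select_reference(
    candidates: Sequence[ReferenceStats],
    min_covered: float = 60.0,   # hypothetical threshold
    min_depth: float = 2.0,      # hypothetical threshold
    min_identity: float = 80.0,  # hypothetical threshold
) -> Optional[ReferenceStats]:
    """Return the best candidate that passes all thresholds, or None."""
    passing = [
        c for c in candidates
        if c.covered >= min_covered
        and c.depth >= min_depth
        and c.identity >= min_identity
    ]
    if not passing:
        return None
    # Tuples compare element by element, so this ranks by length covered
    # first, then depth of coverage, then identity.
    return max(passing, key=lambda c: (c.covered, c.depth, c.identity))


candidates = [
    ReferenceStats("wzx_208_AF529080_O26", 100.0, 223.3072050673001, 100.0),
    ReferenceStats("wzy_192_AF529080_O26", 100.0, 281.95405669599216, 100.0),
]
best = select_reference(candidates)
print(best.name if best else "no reference passed the thresholds")
```

Both candidates are fully covered, so the tie is broken by depth of coverage, which is consistent with `wzy_192_AF529080_O26` being reported as `selected` in the example report.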
For the BLAST approach (when using sequence FASTA files), the sequence selected for each DB sequence is determined according to the best BLAST hit. The best hit is defined by the largest alignment length, highest similarity, lowest E-value, lowest number of gaps and largest reference sequence length (applied hierarchically, in that order). The criteria for selecting the sequence are the same as in the read mapping approach (although the depth of coverage will always be 1).
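The best-hit ordering can be expressed as a single sort key, with the values that should be maximised negated. This is a minimal sketch under stated assumptions: the `BlastHit` type, the `best_hit` function and the hit values are hypothetical and made up for illustration, and the key reads "lowest E-value and number of gaps" as two separate criteria applied in that order.

```python
from typing import NamedTuple, Sequence


class BlastHit(NamedTuple):
    """One hit against a DB sequence (hypothetical container)."""
    query: str
    alignment_length: int
    similarity: float      # percentage identity of the alignment
    evalue: float
    gaps: int
    reference_length: int


def best_hit(hits: Sequence[BlastHit]) -> BlastHit:
    """Pick the best hit using the hierarchical criteria from the text."""
    return min(
        hits,
        key=lambda h: (
            -h.alignment_length,   # largest alignment length first
            -h.similarity,         # then highest similarity
            h.evalue,              # then lowest E-value
            h.gaps,                # then fewest gaps
            -h.reference_length,   # then largest reference sequence
        ),
    )


# Illustrative values only, not real results.
hits = [
    BlastHit("contig_1", 1200, 98.5, 0.0, 2, 1250),
    BlastHit("contig_2", 1200, 99.1, 0.0, 4, 1250),
    BlastHit("contig_3", 900, 100.0, 0.0, 0, 1250),
]
chosen = best_hit(hits)
print(chosen.query)
```

Here `contig_3` loses on alignment length despite its perfect similarity, and `contig_2` beats `contig_1` on similarity before the number of gaps is ever considered.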
In both cases, producing the reference sequence database requires manual curation and sequence type definition.

#sequence_type	reference_file	type	sequence	sequenced_covered	coverage_depth	sequence_identity	query	q_start	q_end	s_start	s_end	evalue	gaps
selected	O_type.fasta	O26	wzy_192_AF529080_O26	100.0	281.95405669599216	100.0	NA	NA	NA	NA	NA	NA	NA
selected	H_type.fasta	H11	fliC_269_AY337465_H11	99.4546693933197	51.76490747087046	99.86291980808772	NA	NA	NA	NA	NA	NA	NA
other_probable_type	O_type.fasta	O26	wzx_208_AF529080_O26	100.0	223.3072050673001	100.0	NA	NA	NA	NA	NA	NA	NA
other_probable_type	H_type.fasta	H11	fliC_276_AY337472_H11	98.84117246080436	37.52551724137931	99.86206896551724	NA	NA	NA	NA	NA	NA	NA

#sequence_type	reference_file	type	sequence	sequenced_covered	coverage_depth	sequence_identity	query	q_start	q_end	s_start	s_end	gaps
selected	1_GenotypesDENV_14-05-18.fasta	3-III	gb:EU529683#...#Subtype:3-III#Host:Human#seqTyping_3-III	100.0	1	99.223	NODE_1_length_10319_cov_2021.782660	138	10307	10170	1	0
other_probable_type	1_GenotypesDENV_14-05-18.fasta	1-V	gb:GQ868570#...#Subtype:1-V#Host:Human#seqTyping_1-V	100.0	1	99.479	NODE_2_length_10199_cov_229.028848	13	10188	1	10176	0
other_probable_type	1_GenotypesDENV_14-05-18.fasta	4-II	gb:GQ868585#...#Subtype:4-II#Host:Human#seqTyping_4-II	100.0	1	99.38	NODE_4_length_10182_cov_29.854132	13	10173	1	10161	3

#sequence_type	reference_file	type	sequence	sequenced_covered	coverage_depth	sequence_identity	query	q_start	q_end	s_start	s_end	evalue	gaps
selected	1_virulence_db.stx1_subtyping.fasta	stx1a	stx1A:15:AF461168:A:seqTyping_stx1a	100.0	65.37447257383967	100.0	NA	NA	NA	NA	NA	NA	NA
selected	2_virulence_db.stx2_subtyping.fasta	stx2c	stx2B:15:AB071845:C:seqTyping_stx2c	100.0	19.377777777777776	100.0	NA	NA	NA	NA	NA	NA	NA
other_probable_type	1_virulence_db.stx1_subtyping.fasta	stx1c	stx1B:11:AB071620:C:seqTyping_stx1c	100.0	21.64814814814815	99.25925925925925	NA	NA	NA	NA	NA	NA	NA
other_probable_type	1_virulence_db.stx1_subtyping.fasta	stx1a	stx1B:14:AM230663:A:seqTyping_stx1a	100.0	45.06666666666667	100.0	NA	NA	NA	NA	NA	NA	NA
other_probable_type	2_virulence_db.stx2_subtyping.fasta	stx2c	stx2B:10:EF441604:C:seqTyping_stx2c	100.0	17.2	99.25925925925925	NA	NA	NA	NA	NA	NA	NA
other_probable_type	2_virulence_db.stx2_subtyping.fasta	stx2d	stx2B:11:FM998840:D:seqTyping_stx2d	100.0	9.996296296296297	99.62962962962963	NA	NA	NA	NA	NA	NA	NA
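
Reports like the ones above are tab-separated with a header line that starts with `#`, so they can be read with Python's standard `csv` module. The following is a minimal sketch: the `read_report` helper is hypothetical and not part of seq_typing, and the embedded example reuses the first two rows of the E. coli serotyping report with the trailing `NA` columns omitted for brevity.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_report(handle: TextIO) -> List[Dict[str, str]]:
    """Parse a tab-separated report into one dict per row (hypothetical helper)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    header[0] = header[0].lstrip("#")  # "#sequence_type" -> "sequence_type"
    return [dict(zip(header, row)) for row in reader if row]


# Shortened copy of the example report (trailing NA columns omitted).
example = "\n".join([
    "\t".join(["#sequence_type", "reference_file", "type", "sequence",
               "sequenced_covered", "coverage_depth", "sequence_identity"]),
    "\t".join(["selected", "O_type.fasta", "O26", "wzy_192_AF529080_O26",
               "100.0", "281.95405669599216", "100.0"]),
    "\t".join(["selected", "H_type.fasta", "H11", "fliC_269_AY337465_H11",
               "99.4546693933197", "51.76490747087046", "99.86291980808772"]),
    "\t".join(["other_probable_type", "O_type.fasta", "O26", "wzx_208_AF529080_O26",
               "100.0", "223.3072050673001", "100.0"]),
])

rows = read_report(io.StringIO(example))
# Keep only the selected type for each reference file.
selected = {r["reference_file"]: r["type"]
            for r in rows if r["sequence_type"] == "selected"}
print(selected)
```

To read a report from disk, pass an open file handle instead of the `io.StringIO` object. All values are returned as strings, so numeric columns such as `coverage_depth` need an explicit `float()` conversion, and `NA` entries must be handled before converting.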
