HIV-TRACE is an application that identifies potential transmission clusters within a supplied FASTA file with an option to find potential links against the Los Alamos HIV Sequence Database.
- gcc >= 6.0.0
- python3 >= 3.5.1
- tn93 >= 1.0.6
HIV-TRACE requires python3 and a working installation of tn93.
pip3 install biopython
pip3 install numpy
pip3 install scipy
pip3 install hivtrace
Tested with Python 3.5.1, 3.5.3, and 3.6.1.
hivtrace -i ./INPUT.FASTA -a resolve -r HXB2_prrt -t .015 -m 500 -g .05 -c
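When the command above is launched from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of HIV-TRACE: the function name `build_hivtrace_command` and its parameter names are hypothetical, and the mapping of each flag to a parameter is inferred from the example command and the option descriptions in this document.

```python
import shlex


def build_hivtrace_command(input_fasta, ambiguities="resolve",
                           reference="HXB2_prrt", threshold=0.015,
                           min_overlap=500, fraction=0.05, compare=True):
    """Assemble the hivtrace argument list (hypothetical helper).

    Only the flags that appear in the example command are used.
    """
    cmd = [
        "hivtrace",
        "-i", input_fasta,
        "-a", ambiguities,
        "-r", reference,
        "-t", str(threshold),
        "-m", str(min_overlap),
        "-g", str(fraction),
    ]
    if compare:
        # Assumption: -c is a bare flag, as in the example command.
        cmd.append("-c")
    return cmd


command = build_hivtrace_command("./INPUT.FASTA")
rendered = " ".join(shlex.quote(part) for part in command)
print(rendered)
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `hivtrace` is installed.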
A FASTA file with the nucleotide sequences to be analyzed. Each sequence will be aligned to the chosen reference sequence prior to network inference. Sequence names may include munged attributes, e.g. ISOLATE_XYZ|2005|SAN DIEGO|MSM
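As an illustration of how such munged attributes might be unpacked downstream, here is a small sketch that splits a sequence name on `|`. The field names (`isolate`, `year`, `location`, `risk_group`) are assumptions made for this example only; HIV-TRACE itself does not prescribe them.

```python
def parse_header(header, field_names=("isolate", "year", "location", "risk_group"),
                 sep="|"):
    """Split a FASTA sequence name into named attributes.

    field_names is a hypothetical labeling of the example header;
    adjust it to match how your own sequence names are built.
    """
    # Drop the leading '>' of a FASTA header line, if present.
    values = header.lstrip(">").strip().split(sep)
    return dict(zip(field_names, values))


attrs = parse_header(">ISOLATE_XYZ|2005|SAN DIEGO|MSM")
print(attrs)
```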
Handle ambiguous nucleotides using one of the following strategies.
Option | Description |
---|---|
resolve | count any resolutions that match as a perfect match |
average | average all possible resolutions |
skip | skip all positions with ambiguities |
gapmm | count character-gap positions as 4-way mismatches, otherwise same as average |
For more details, please see the MBE paper.
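To make the difference between the strategies concrete, the sketch below scores a single aligned position under `resolve`, `average`, and `skip`. It is a simplified illustration using a plain mismatch count, not the TN93 distance that tn93 actually computes, and it leaves out `gapmm`. The function name `mismatch` is hypothetical, and falling back to averaging when no resolution matches is an assumption of this sketch.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one can resolve to.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch(a, b, strategy):
    """Return the mismatch contribution of one position, or None if skipped."""
    ra, rb = IUPAC[a], IUPAC[b]
    ambiguous = len(ra) > 1 or len(rb) > 1
    if strategy == "skip" and ambiguous:
        return None  # position is excluded from the comparison
    if strategy == "resolve" and set(ra) & set(rb):
        return 0.0  # at least one shared resolution counts as a perfect match
    # average (and resolve with no shared resolution, by assumption):
    # fraction of resolution pairs that differ
    pairs = list(product(ra, rb))
    return sum(x != y for x, y in pairs) / len(pairs)


for strategy in ("resolve", "average", "skip"):
    print(strategy, mismatch("R", "A", strategy))
```

For `R` (A or G) against `A`, `resolve` treats the position as a match, `average` scores it as half a mismatch, and `skip` drops it.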
The reference sequence to which all provided sequences will be aligned. The input sequences are assumed to be homologous to the reference and to have little indel variation.
Option | Description |
---|---|
HXB2_vif | Viral Infectivity Factor |
HXB2_vpu | Viral Protein U |
HXB2_int | |
HXB2_vpr | Viral Protein R |
HXB2_pr | |
HXB2_pol | The genomic region encoding the viral enzymes protease, reverse transcriptase, and integrase |
HXB2_tat | Transactivator of HIV gene expression |
HXB2_rt | |
NL4-3_prrt | |
HXB2_prrt | |
HXB2_nef | 27-kd myristoylated protein produced by an ORF located at the 3' end of primate lentiviruses |
HXB2_gag | The genomic region encoding the capsid proteins (group specific antigens) |
HXB2_env | Viral glycoproteins produced as a precursor (gp160) |
HXB2_rev | The second necessary regulatory factor for HIV expression |
Path/to/FASTA/file | Path to a custom reference file |
If the presets are unfamiliar, please refer to the landmarks of the HIV-1 genome.
Two sequences will be connected with a putative link (subject to filtering, see below) if and only if their pairwise distance does not exceed this threshold.
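The thresholding rule can be sketched as follows. The distances here are made-up illustrative values, and `link_sequences` is a hypothetical helper rather than an HIV-TRACE function; in practice the pairwise distances come from tn93.

```python
def link_sequences(distances, threshold=0.015):
    """Keep the pairs whose distance does not exceed the threshold."""
    return [pair for pair, d in sorted(distances.items()) if d <= threshold]


# Illustrative pairwise distances, not real data.
pairwise = {
    ("seq1", "seq2"): 0.010,
    ("seq1", "seq3"): 0.015,
    ("seq2", "seq3"): 0.021,
}
links = link_sequences(pairwise)
print(links)
```

Note that a pair whose distance is exactly equal to the threshold is still linked, because the rule is "does not exceed".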
Only sequences that overlap by at least this many non-gap characters will be included in distance calculations. Be sure to adjust this based on the length of the input sequences. You should aim to have at least 2/(distance threshold) aligned characters.
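The rule of thumb is easy to compute for a given threshold. A minimal sketch, where `suggested_min_overlap` is a hypothetical helper name:

```python
import math


def suggested_min_overlap(threshold):
    """Smallest whole number of aligned characters that is at least 2 / threshold."""
    return math.ceil(2 / threshold)


minimum = suggested_min_overlap(0.015)
print(minimum)
```

For a distance threshold of 0.015 this gives 134 aligned characters as a lower bound, so the value of 500 used in the example command is comfortably above it.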
Affects only the Resolve option for handling ambiguities. Any sequence whose proportion of ambiguous nucleotides does not exceed the selected value [0 - 1] will have its ambiguities resolved (if possible); ambiguities in sequences with a higher proportion will be averaged. This mitigates spurious linkages due to highly ambiguous sequences.
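The decision described above can be sketched as a per-sequence check. This is an illustration only: the helper names are hypothetical, and computing the proportion over non-gap characters is an assumption of this sketch, not a documented detail of tn93.

```python
def ambiguity_fraction(sequence):
    """Proportion of non-gap characters that are not A, C, G, or T.

    Assumption: gaps ('-') are excluded from the denominator.
    """
    bases = [c for c in sequence.upper() if c != "-"]
    if not bases:
        return 0.0
    return sum(c not in "ACGT" for c in bases) / len(bases)


def strategy_for(sequence, max_fraction=0.05):
    """Resolve sequences at or below the cutoff, average the rest."""
    if ambiguity_fraction(sequence) <= max_fraction:
        return "resolve"
    return "average"


print(strategy_for("ACGTACGTACGTACGTACGR"))  # 1 ambiguous base out of 20
print(strategy_for("ACGRNY"))                # 3 ambiguous bases out of 6
```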
Screen for contaminants by marking or removing sequences that cluster with any of the contaminant IDs.
Option | Description |
---|---|
remove | Remove spurious edges from the inferred network |
report | Flag all sequences sharing a cluster with the reference |
separately | Flag all sequences and report them via secondary tn93 command |
none | Do nothing |
Use a phylogenetic test of conditional independence on each triangle in the network to remove spurious transitive connections which make A->B->C chains look like A-B-C triangles.
Option | Description |
---|---|
remove | removes spurious transitive connections |
report | reports spurious transitive connections |
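The filter operates on triangles, so a first step is to find them. The sketch below only enumerates triangles in an undirected edge list; it does not implement the phylogenetic test of conditional independence, and `find_triangles` is a hypothetical name used for illustration.

```python
from itertools import combinations


def find_triangles(edge_list):
    """Return every set of three mutually connected nodes, as sorted tuples."""
    neighbors = {}
    for a, b in edge_list:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    triangles = []
    # Brute-force check of every node triple; fine for small illustrative networks.
    for a, b, c in combinations(sorted(neighbors), 3):
        if b in neighbors[a] and c in neighbors[a] and c in neighbors[b]:
            triangles.append((a, b, c))
    return triangles


network = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
triangles = find_triangles(network)
print(triangles)
```

Each triangle found this way is a candidate for the test, which decides whether one of its three edges is a spurious transitive connection.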
Masks known drug resistance-associated mutation (DRAM) positions in the provided sequences.
Option | Description |
---|---|
lewis | Mask (with ---) the list of codon sites defined in Lewis et al. |
wheeler | Mask (with ---) the list of codon sites defined in Wheeler et al. |
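Masking a codon site with `---` can be sketched as below. The site list is a placeholder, not the Lewis or Wheeler list, and the sketch assumes 1-based codon numbering on an in-frame sequence that starts at codon 1; `mask_codons` is a hypothetical helper.

```python
def mask_codons(aligned_sequence, codon_sites, mask="---"):
    """Replace each listed codon (1-based) with the mask string."""
    seq = list(aligned_sequence)
    for site in codon_sites:
        start = (site - 1) * 3
        # Skip sites that fall outside the sequence.
        if start + 3 <= len(seq):
            seq[start:start + 3] = mask
    return "".join(seq)


PLACEHOLDER_SITES = [2, 4]  # hypothetical sites, for illustration only
masked = mask_codons("ATGGCCAAATTTGGG", PLACEHOLDER_SITES)
print(masked)
```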
Compare the supplied sequences to all public sequences, which are retrieved periodically from the Los Alamos HIV Sequence Database.
Specify output filename. If no output filename is provided, then the output filename will be <input_filename>.results.json
You can either use the command hivtrace_viz <path_to_json_file> or visit https://veg.github.io/hivtrace-viz/ and click Load File.