Releases: marbl/harvest-tools
HarvestTools v1.3
New features
- VCF annotations - There are new INFO fields in VCF output indicating what CDS locus, if any, each variant lies in, the amino acid changes for each allele, and whether all alleles are synonymous.
- Internal variants - The `--internal` option for VCF export restricts to positions that differ within a group of genomes. The group can be specific track names or a clade, specified as the LCA of two tracks. Columns will be restricted to the genomes of interest.
- Signature variants - The `--signature` option for VCF export restricts to positions that are consistent within a group of genomes and different outside of it. The group can be specific track names or a clade, specified as the LCA of two tracks. All columns will be included.
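
The difference between internal and signature variants can be illustrated with a small sketch. This is not HarvestTools code: the function names, the dictionary-per-position data layout, and the reading of "different outside of it" as "no genome outside the group carries the group's allele" are all assumptions made for illustration.

```python
from typing import Dict, List


def is_internal(alleles: Dict[str, str], group: List[str]) -> bool:
    """True if the genomes in `group` carry more than one allele here."""
    return len({alleles[name] for name in group}) > 1


def is_signature(alleles: Dict[str, str], group: List[str]) -> bool:
    """True if the group shares a single allele that no outside genome carries.

    Assumption: a position with no genomes outside the group is not
    treated as a signature, since there is nothing to differ from.
    """
    inside = {alleles[name] for name in group}
    outside = {a for name, a in alleles.items() if name not in group}
    return len(inside) == 1 and bool(outside) and inside.isdisjoint(outside)


# Hypothetical alleles at three positions for tracks A-D; the group is [A, B].
positions = [
    {"A": "C", "B": "C", "C": "T", "D": "T"},  # signature, not internal
    {"A": "C", "B": "G", "C": "C", "D": "C"},  # internal, not signature
    {"A": "C", "B": "C", "C": "C", "D": "T"},  # neither
]
group = ["A", "B"]
internal_hits = [i for i, p in enumerate(positions) if is_internal(p, group)]
signature_hits = [i for i, p in enumerate(positions) if is_signature(p, group)]
```

In this sketch an internal filter would additionally drop the columns for tracks C and D, whereas a signature filter keeps every column, matching the behavior described above.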
Changes
- GenBank annotations are now reconciled with fasta reference sequences using accessions rather than GIs, due to the retirement of GIs by NCBI.
v1.2
Bug fixes
- Large alignments will no longer run into serialization size limits when creating Gingr files (Issue #3), thanks to the Cap'n Proto library. Older Gingr files in the Protocol Buffer format can still be read, but will not be written (and may eventually be phased out).
- Exported trees will no longer have bootstrap or branch length values for the root node, which was not standard Newick and could cause crashes when loading them back in.
- Exported Fasta reference files will no longer have an extra space between the ID and description in their tags.
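
For trees exported by earlier versions, the root annotation described in the Newick fix above can be removed by hand. The following is a minimal sketch, not part of HarvestTools: `strip_root_annotation` is a hypothetical helper, it assumes the tree contains no quoted labels with parentheses in them, and it drops everything after the final closing parenthesis, including any root name.

```python
def strip_root_annotation(newick: str) -> str:
    """Remove any label, bootstrap, or branch length attached to the root."""
    text = newick.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    last_paren = text.rfind(")")
    if last_paren == -1:
        # A single-leaf tree has no parenthesized root; leave it unchanged.
        return text
    # Keep everything up to the root's closing parenthesis, then terminate.
    return text[: last_paren + 1] + ";"


old_tree = "((A:0.1,B:0.2)90:0.3,C:0.4)100:0.0;"
cleaned = strip_root_annotation(old_tree)
```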
v1.1
Features
- VCF import now supports indels. See the documentation for details and caveats.
- MAF alignments can now be imported. See the documentation for details and caveats.
- Concatenated MFA (multi-fasta), in which all LCBs are joined, can now be output. Note that this will create spuriously contiguous sequences at LCB boundaries and should be used with caution.
- Tree branch lengths can be adjusted (with `-u`) to account for trees generated on only variant columns, rather than entire multi-alignments (Gingr v1.1.1 is needed to display the adjusted lengths).
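
The caution about concatenated MFA output is easier to see with a toy example. This sketch is not the HarvestTools implementation: `concatenate_lcbs` and the list-of-dictionaries layout are hypothetical, and it assumes every LCB contains an aligned row for every track.

```python
from typing import Dict, List


def concatenate_lcbs(lcbs: List[Dict[str, str]]) -> Dict[str, str]:
    """Join the aligned rows of each track across all LCBs, in order."""
    tracks = list(lcbs[0]) if lcbs else []
    parts: Dict[str, List[str]] = {name: [] for name in tracks}
    for block in lcbs:
        for name in tracks:
            parts[name].append(block[name])
    return {name: "".join(chunks) for name, chunks in parts.items()}


# Two hypothetical LCBs that may lie far apart in the genomes.
lcbs = [
    {"ref": "ACGT", "s1": "ACGA"},
    {"ref": "TT-G", "s1": "TTAG"},
]
joined = concatenate_lcbs(lcbs)
```

In the joined rows, the last column of the first LCB sits directly next to the first column of the second, even though the two blocks may not be adjacent in any genome. Analyses that depend on neighboring columns will treat that junction as real sequence.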
Bug fixes
- Non-wrapped Fasta files (with each full sequence on one line) no longer cause hangs when importing as references.
- Writing XMFA output when the final LCB contains no variants will no longer crash or produce infinite output.
Changes
- The binary has been renamed from `harvest` to `harvesttools`.