Merge pull request #44 from campanam/v1.8.0-development
V1.8.0 development
campanam authored Aug 1, 2023
2 parents 77d1d70 + 114c8cc commit da61ef0
Showing 8 changed files with 93 additions and 17 deletions.
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -20,6 +20,9 @@ Contact: campanam@si.edu
[Deprecated](#deprecated)

## aln2baits
### Version 1.8.0
Added --maxvars option so that the variant haplotype definition runs efficiently and functions properly

### Version 1.7.8
Fixed bug in --shuffle option that caused infinite loop
Fixed hanging bug when running variant options
@@ -142,6 +145,9 @@ Version constant added to header
Preliminary script to generate baits from an annotation file and a reference sequence

## baitslib
### Version 1.8.0
Handling for --maxvars option

### Version 1.7.4
mean method uses .sum rather than .reduce(:+)

@@ -295,6 +301,9 @@ New method write_probes handles basic output
filter_probes definition removed into separate script for access by other scripts

## baitstools
### Version 1.8.0
Handling for --maxvars option

### Version 1.7.4
Conversion to a RubyGem

@@ -459,6 +468,9 @@ The word 'probe' changed to 'baits' in all instances for clarity
Set default for tiling offset as 20 bp (from 60 for select_snps and 25 for tile_probes)

## baitstoolsgui
### Version 1.8.0
Handling for --maxvars option

### Version 1.7.5
Fixed bug calling baitstools.rb rather than updated baitstools executable

@@ -655,6 +667,9 @@ Version constant added to header
Preliminary script to filter predefined baits through quality filters

## osx_install
### Version 1.8.0
Installs latest baitstools gem (1.8.0)

### Version 1.7.8
Installs latest baitstools gem (1.7.8)

18 changes: 12 additions & 6 deletions README.md
@@ -43,14 +43,18 @@ The software is made available under the Smithsonian Institution [terms of use](
General instructions for installation using RubyGems/Bundler and specific instructions for macOS are provided below. You can test your BaitsTools installation by running the tutorials included in the example_data directory. The archive "tutorial.tgz" includes the expected output of each tutorial. Note that vcf2baits and stacks2baits output will vary slightly due to the random number generator.

### Installation using RubyGems and Bundler
The BaitsTools executables can be installed using [RubyGems](https://www.rubygems.org) and [Bundler](https://bundler.io/) (available on most UNIX-like operating systems with [Ruby](https://www.ruby-lang.org) and RubyGems installed). See instructions for macOS below as macOS requires the [Ruby Version Manager](https://rvm.io) to manually install Ruby gems. See the Ruby and RubyGems documentation for installation on other operating systems.
The BaitsTools executables can be installed using [RubyGems](https://www.rubygems.org) and [Bundler](https://bundler.io/) (available on most UNIX-like operating systems with [Ruby](https://www.ruby-lang.org) and RubyGems installed). See instructions for macOS below as macOS requires the [Ruby Version Manager](https://rvm.io) to manually install Ruby gems. See the Ruby and RubyGems documentation for installation on other operating systems. Precompiled gems are available [here](https://github.com/campanam/BaitsTools/pkgs/rubygems/baitstools).

In a terminal window, execute the following commands:
After downloading the latest precompiled gem, execute the following command in a terminal window:

`gem install baitstools-1.8.0.gem`

To manually build and install the gem, execute the following commands in a terminal window:

`git clone https://github.com/campanam/baitstools`
`cd baitstools`
`gem build baitstools.gemspec`
`gem install baitstools-1.7.8.gem`
`gem install baitstools-1.8.0.gem`

### macOS Installation
macOS uses a deprecated version of Tcl-Tk as its default Tk framework. For best results, install [ActiveTcl 8.6](https://www.activestate.com/products/activetcl/downloads/) and then reinstall the tk gem (`gem install tk`). Tcl-Tk can also be installed using [Homebrew](https://brew.sh) or [Anaconda](https://anaconda.org/), but the windows are not optimized for these methods.
@@ -74,7 +78,7 @@ Enter the following commands (step annotations are provided after the highlighte
`git clone https://github.com/campanam/baitstools`: Download the BaitsTools repository.
`cd baitstools`: Enter the baitstools directory.
`gem build baitstools.gemspec`: Build the BaitsTools gem.
`gem install baitstools-1.7.5.gem`: Install the BaitsTools gem.
`gem install baitstools-1.8.0.gem`: Install the BaitsTools gem.

_macOS Installation Notes:_
1. The Ruby Version Manager uses [Homebrew](https://brew.sh). During installation you may need to give an administrator password and authorization to install/update Homebrew.
@@ -183,7 +187,8 @@ aln2baits generates baits from a DNA alignment in FASTA or FASTQ format. Bait se
`-i, --input [FILE]`: Input alignment file name. Include the path to the file if not in the current directory.
`-L, --length [VALUE]`: Requested bait length. Default is 120 bp.
`-O, --offset [VALUE]`: Offset (in bp) between tiled baits. Default is 60 bp.
`-H, --haplo [VALUE]`: Alignment window haplotype definition (`haplotype` or `variant`). `haplotype` will cause the program to identify all unique haplotypes within each bait tiling window observed in the data. `variant` will cause the program to generate all possible permutations of single nucleotide variants observed within the window. Default is `haplotype`.
`-H, --haplo [VALUE]`: Alignment window haplotype definition (`haplotype` or `variant`). `haplotype` will cause the program to identify all unique haplotypes within each bait tiling window observed in the data. `variant` will cause the program to generate random permutations of single nucleotide variants observed within the window. Default is `haplotype`.
`--maxvars [VALUE]`: Maximum number of variant permutations to retain within each alignment window when using the `variant` haplotype definition. Default is 24.
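
To see why a cap is useful, note that the number of full variant permutations in a window is the product of the allele counts at each variable site, so it grows quickly with the number of sites. The following is a minimal Ruby sketch of that arithmetic only. `permutation_count` and `retained_count` are hypothetical helper names, not BaitsTools methods, and the default of 24 is taken from the option description above.

```ruby
# Hypothetical helpers (not part of BaitsTools) illustrating the arithmetic
# behind --maxvars. Each element of `variants` lists the alleles at one site.

# Number of distinct permutations if every combination were generated.
def permutation_count(variants)
  variants.reduce(1) { |product, alleles| product * alleles.size }
end

# Number of permutations kept once the cap is applied.
def retained_count(variants, maxvars = 24)
  [permutation_count(variants), maxvars].min
end

window = Array.new(10) { %w[A G] } # ten biallelic sites in one window
puts permutation_count(window)     # 2**10 combinations
puts retained_count(window)        # capped at the default of 24
```

Windows with few variable sites fall below the cap and are unaffected by it.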

### annot2baits
annot2baits generates baits from an annotation file in GTF or GFF and a corresponding DNA sequence in FASTA or FASTQ format.
@@ -232,7 +237,8 @@ pyrad2baits selects variants and generates baits from a PyRAD/ipyrad loci file.
`-O, --offset [VALUE]`: Base pair offset between tiled baits. Default is 60 bp.
`-I, --minind [VALUE]`: Minimum number of individuals to include locus. Default is 1.
`-W, --strategy [VALUE]`: Strategy to generate baits from loci (`alignment`, `SNPs`, or `informative`). `alignment` treats the individual loci as FASTA alignments and passes the alignments to [aln2baits](#aln2baits) to generate weighted alignments. `SNPs` and `informative` select and generate baits for identified variable sites. `SNPs` includes all identified sites, whereas `informative` includes only phylogenetically informative sites. Default is `alignment`.
`-H, --haplo [VALUE]`: If using `alignment` strategy, alignment window haplotype definition (`haplotype` or `variant`). `haplotype` will cause the program to identify all unique haplotypes within each bait tiling window observed in the data. `variant` will cause the program to generate all possible permutations of single nucleotide variants observed within the window. Default is `haplotype`.
`-H, --haplo [VALUE]`: If using `alignment` strategy, alignment window haplotype definition (`haplotype` or `variant`). `haplotype` will cause the program to identify all unique haplotypes within each bait tiling window observed in the data. `variant` will cause the program to generate random permutations of single nucleotide variants observed within the window. Default is `haplotype`.
`--maxvars [VALUE]`: Maximum number of variant permutations to retain within each alignment window when using the `variant` haplotype definition. Default is 24.
`--uncollapsedref`: If using `SNPs` or `informative` strategies, choose a random reference sequence and keep ambiguities for each locus.
`-a, --alt`: If using `SNPs` or `informative` strategies, generate baits for alternate alleles.
`-t, --totalvars [VALUE]`: If using `SNPs` or `informative` strategies, total requested variants. Default is 30,000.
2 changes: 1 addition & 1 deletion baitstools.gemspec
@@ -1,6 +1,6 @@
Gem::Specification.new do |s|
s.name = 'baitstools'
s.version = '1.7.8'
s.version = '1.8.0'
s.required_ruby_version = '>= 2.4.1'
s.date = '2023-07-31'
s.summary = 'BaitsTools: Software for hybridization capture bait design'
17 changes: 16 additions & 1 deletion bin/baitstools
@@ -1,7 +1,7 @@
#!/usr/bin/env ruby
#-----------------------------------------------------------------------------------------------
# baitstools
BAITSTOOLSVER = "1.7.8"
BAITSTOOLSVER = "1.8.0"
# Michael G. Campana, 2017-2023
# Smithsonian's National Zoo and Conservation Biology Institute
#-----------------------------------------------------------------------------------------------
@@ -74,6 +74,7 @@ class Parser
args.no_Ns = false # Flag to omit bait sequences with Ns
args.collapse_ambiguities = false # Flag to collapse ambiguities to a single nucleotide
args.haplodef = "haplotype" # Haplotype definition for aln2baits
args.maxvars = 24 # Maximum number of retained variant permutations per window for aln2baits
args.uncollapsed_ref = false # Flag to keep ambiguities in pyrad reference sequence
args.sort = false # Flag to sort stacks2baits SNPs by between/within population variation
args.hwe = false # Flag to sort stacks2baits SNPs by Hardy-Weinberg Equilibrium
@@ -186,6 +187,9 @@ class Parser
opts.on("-H","--haplo [VALUE]", String, "If using alignment strategy, window haplotype definition (haplotype or variant) (Default = haplotype)") do |fa|
args.haplodef = fa.downcase if fa != nil
end
opts.on("--maxvars [VALUE]",Integer, "Maximum number of variant permutations per alignment window (Default = 24)") do |maxvars|
args.maxvars = maxvars if maxvars != nil
end
opts.on("--uncollapsedref","Keep ambiguities in pyrad2baits reference sequence") do
args.uncollapsed_ref = true
end
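
The `--maxvars [VALUE]` handler above relies on Ruby's `OptionParser` passing `nil` to the block when the bracketed value is omitted, which is why the assignment is guarded. The following standalone sketch shows that pattern in isolation. The `Options` struct and `parse_maxvars` method are hypothetical names for illustration; only the `opts.on` call mirrors the diff.

```ruby
require 'optparse'

# Hypothetical container; BaitsTools uses its own options object.
Options = Struct.new(:maxvars)

# Parse an argument list and return the options, keeping the default of 24
# when --maxvars is absent or given without a value.
def parse_maxvars(argv)
  args = Options.new(24)
  parser = OptionParser.new do |opts|
    opts.on("--maxvars [VALUE]", Integer,
            "Maximum number of variant permutations per alignment window (Default = 24)") do |maxvars|
      args.maxvars = maxvars unless maxvars.nil?
    end
  end
  parser.parse!(argv.dup) # dup so the caller's array is not modified
  args
end

puts parse_maxvars(["--maxvars", "10"]).maxvars
puts parse_maxvars([]).maxvars
```

Declaring the type as `Integer` lets `OptionParser` do the string-to-integer conversion before the block runs.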
@@ -252,6 +256,9 @@ class Parser
opts.on("-H","--haplo [VALUE]", String, "Window haplotype definition (haplotype or variant) (Default = haplotype)") do |fa|
args.haplodef = fa.downcase if fa != nil
end
opts.on("--maxvars [VALUE]",Integer, "Maximum number of variant permutations per alignment window (Default = 24)") do |maxvars|
args.maxvars = maxvars if maxvars != nil
end
end
if args.algorithm == "blast2baits"
opts.on("--percid [VALUE]", Float, "Minimum percent identity to include BLAST hit (Default = 0.0)") do |percid|
@@ -837,6 +844,14 @@ begin
print "Please choose a haplotype definition (haplotype or variant)\n"
$options.haplodef = gets.chomp.downcase
end
if $options.interact and $options.haplodef == "variant"
print "Enter maximum number of variant permutations per alignment window to retain.\n"
$options.maxvars = gets.chomp.to_i
end
while $options.maxvars <= 0 and $options.haplodef == "variant"
print "Maximum number of retained variant permutations must be greater than 0. Re-enter.\n"
$options.maxvars = gets.chomp.to_i
end
end
if $options.algorithm == "blast2baits"
if $options.interact
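
The interactive check above re-prompts until a positive value is entered. A minimal standalone sketch of the same loop follows, written so it can be exercised without a terminal. `prompt_positive_integer` and its `input`/`output` parameters are hypothetical and are not part of BaitsTools. Unlike the original, the sketch raises on end of input instead of calling `chomp` on `nil`.

```ruby
require 'stringio'

# Hypothetical helper: keep asking until a positive integer is read.
# Non-numeric input converts to 0 via String#to_i and is therefore rejected.
def prompt_positive_integer(input: $stdin, output: $stdout)
  output.print "Enter maximum number of variant permutations per alignment window to retain.\n"
  loop do
    line = input.gets
    raise EOFError, "input ended before a valid value was entered" if line.nil?
    value = line.chomp.to_i
    return value if value > 0
    output.print "Maximum number of retained variant permutations must be greater than 0. Re-enter.\n"
  end
end

# Simulated session: two rejected entries followed by a valid one.
answers = StringIO.new("0\nabc\n12\n")
puts prompt_positive_integer(input: answers, output: StringIO.new)
```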
43 changes: 40 additions & 3 deletions bin/baitstoolsgui
@@ -1,7 +1,7 @@
#!/usr/bin/env ruby
#-----------------------------------------------------------------------------------------------
# baitstoolsgui
BAITSTOOLSGUI = "1.7.8"
BAITSTOOLSGUI = "1.8.0"
# Michael G. Campana, 2017-2023
# Smithsonian's National Zoo and Conservation Biology Institute
#-----------------------------------------------------------------------------------------------
@@ -58,6 +58,7 @@ def start_baitstools
end
if $options.algorithm == "aln2baits" or ($options.algorithm == "pyrad2baits" && $options.strategy == "alignment")
cmdline << " -H " + $options.haplodef
cmdline << " --maxvars " + $options.maxvars if $options.haplodef == "variant"
elsif $options.algorithm == "annot2baits"
cmdline << " -U " + $options.features.value.upcase
elsif $options.algorithm == "blast2baits"
@@ -235,6 +236,11 @@ def update_strategy
$alts.state = "disabled"
$uncollapsedref.state = "disabled"
$haplo.state = $haploselect.state = "normal"
if $options.haplodef == "variant"
$maxvars.state = $maxvarsentry.state = "normal"
else
$maxvars.state = $maxvarsentry.state = "disabled"
end
else
$maxsnpentry.state = $maxsnps.state = "normal"
$distanceentry.state = $distance.state = "normal"
@@ -244,6 +250,7 @@ def update_strategy
$alts.state = "normal"
$uncollapsedref.state = "normal"
$haplo.state = $haploselect.state = "disabled"
$maxvars.state = $maxvarsentry.state = "disabled"
end
end
#-----------------------------------------------------------------------------------------------
@@ -421,7 +428,31 @@ def haplodef_window
height 2
place('x' => 240, 'y' => 260)
end
$widgets.push($haplo, $haploselect)
$haploselect.bind("<ComboboxSelected>") do
update_haplodef
end
$maxvars = TkLabel.new($root) do
text 'Maximum variants per window'
font TkFont.new('times 20')
place('x' => 400, 'y' => 250)
pady 10
end
$maxvarsentry = TkEntry.new($root) do
textvariable $options.maxvars
borderwidth 5
font TkFont.new('times 12')
place('x' => 660, 'y' => 260)
width 10
end
$widgets.push($haplo, $haploselect,$maxvars,$maxvarsentry)
end
#-----------------------------------------------------------------------------------------------
def update_haplodef
if $options.haplodef == "variant"
$maxvars.state = $maxvarsentry.state = "normal"
else
$maxvars.state = $maxvarsentry.state = "disabled"
end
end
#-----------------------------------------------------------------------------------------------
def reference_window(winy = 150)
@@ -828,6 +859,7 @@ def subcommand_window(subcommand)
offset_window
haplodef_window
inputlabel = "Input FASTA/FASTQ"
update_haplodef
when "annot2baits"
reference_window
pad_window
@@ -1301,6 +1333,8 @@ def go_forward
Tk::messageBox :message => 'Please specify an input file.'
elsif ($options.algorithm == "annot2baits" or $options.algorithm == "bed2baits" or $options.algorithm == "blast2baits") && $options.refseq == ""
Tk::messageBox :message => 'Please specify a reference sequence.'
elsif $options.algorithm == "aln2baits" && $options.strategy == "alignment" && $options.haplodef == "variant" && $options.maxvars < 1
Tk::messageBox :message => 'Maximum number of variant permutations per window must be greater than 0.'
elsif ($options.algorithm == "annot2baits" or $options.algorithm == "bed2baits" or $options.algorithm == "blast2baits") && $options.pad < 0
Tk::messageBox :message => 'Pad length cannot be less than 0.'
elsif $options.baitlength < 1
@@ -1325,6 +1359,8 @@ def go_forward
Tk::messageBox :message => 'Tiling offset must be greater than 0.'
elsif $options.minind < 1
Tk::messageBox :message => 'Minimum individuals must be greater than 0.'
elsif $options.strategy == "alignment" && $options.haplodef == "variant" && $options.maxvars < 1
Tk::messageBox :message => 'Maximum number of variant permutations per window must be greater than 0.'
elsif $options.strategy != "alignment"
if $options.totalsnps < 1
Tk::messageBox :message => 'The total number of variants must be greater than 0.'
@@ -1532,6 +1568,7 @@ def set_defaults
$options.tileoffset = TkVariable.new(60) # Offset between tiled baits
$options.bait_type = TkVariable.new("RNA-DNA") # Hybridization type
$options.haplodef = TkVariable.new("haplotype") # Haplotype definition for aln2baits
$options.maxvars = TkVariable.new(24) # Maximum number of variant permutations per alignment window
$options.list_format = TkVariable.new("BED") # Interval list file format
$options.features = TkVariable.new("") # Desired features in comma-separated list
$options.pad = TkVariable.new(0) # BP to pad ends of extracted regions
@@ -1635,7 +1672,7 @@ $next_btn = TkButton.new($root) do
place('x' => 660, 'y' => 520)
end
credit = TkLabel.new($root) do
text "Michael G. Campana, 2017-2022\nSmithsonian's National Zoo and Conservation Biology Institute"
text "Michael G. Campana, 2017-2023\nSmithsonian's National Zoo and Conservation Biology Institute"
borderwidth 5
font TkFont.new('times 12')
pack("side" => "bottom", "padx"=> "50", "pady"=> "10")
6 changes: 4 additions & 2 deletions lib/aln2baits.rb
@@ -1,7 +1,7 @@
#!/usr/bin/env ruby
#-----------------------------------------------------------------------------------------------
# aln2baits
ALN2BAITSVER = "1.7.8"
ALN2BAITSVER = "1.8.0"
# Michael G. Campana, 2017-2023
# Smithsonian Conservation Biology Institute
#-----------------------------------------------------------------------------------------------
@@ -40,7 +40,9 @@ def var_permutations(aln) # Get possible variant permutations
varindex = 1 # Index for sequence
for var in variants
varindex *= var.size # Must be complete down here or interferes with multithreading
break if varindex > $options.maxvars # Control unnecessary extra processing if too many variants requested
end
varindex = $options.maxvars if varindex > $options.maxvars
revised_haplos = []
bedstarts = self.bedstarts[0] # Reset bedstart array and assume coordinates of first array member
self.bedstarts = []
Expand All @@ -54,7 +56,7 @@ def var_permutations(aln) # Get possible variant permutations
for Thread.current[:k] in 0...varindex
if Thread.current[:k] % $options.threads == j
for Thread.current[:i] in 0...self.haplotypes[0].length
Thread.current[:var] = Thread.current[:k] % variants[Thread.current[:i]].size
Thread.current[:var] = rand(variants[Thread.current[:i]].size)
revised_haplos[Thread.current[:k]] << variants[Thread.current[:i]][Thread.current[:var]] # Minimize lock time
end
if $options.gaps == "extend"
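
The change above replaces the modulo-based allele choice with a random draw at each alignment column and caps the number of generated haplotypes at `$options.maxvars`. The following is a simplified, single-threaded sketch of that idea, not the BaitsTools implementation. `sample_variant_haplotypes` and its `rng` parameter are hypothetical, and the sketch assumes every column has at least one observed allele.

```ruby
# Hypothetical sketch: draw up to `maxvars` random haplotypes from the
# alleles observed at each alignment column.
# `variants` is an array of columns, each an array of observed alleles.
def sample_variant_haplotypes(variants, maxvars, rng: Random.new)
  total = 1
  variants.each do |alleles|
    total *= alleles.size
    break if total > maxvars # no need to keep multiplying past the cap
  end
  count = [total, maxvars].min
  Array.new(count) do
    # Pick one allele per column independently and join into a sequence.
    variants.map { |alleles| alleles[rng.rand(alleles.size)] }.join
  end
end

variants = [%w[A G], %w[C], %w[T C G]]
haplotypes = sample_variant_haplotypes(variants, 4, rng: Random.new(42))
puts haplotypes.size
```

Because each draw is independent, the same sequence can be produced more than once. Whether duplicates should be removed afterwards (for example with `uniq`) is a design choice this sketch leaves open.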
5 changes: 3 additions & 2 deletions lib/baitslib.rb
@@ -1,8 +1,8 @@
#!/usr/bin/env ruby
#-----------------------------------------------------------------------------------------------
# baitslib
BAITSLIBVER = "1.7.5"
# Michael G. Campana, 2017-2022
BAITSLIBVER = "1.8.0"
# Michael G. Campana, 2017-2023
# Smithsonian's National Zoo and Conservation Biology Institute
#-----------------------------------------------------------------------------------------------

@@ -1010,6 +1010,7 @@ def get_command_line # Get command line for summary output
end
if $options.algorithm == "aln2baits" or ($options.algorithm == "pyrad2baits" && $options.strategy == "alignment")
cmdline << " -H " + $options.haplodef
cmdline << " --maxvars " + $options.maxvars.to_s if $options.haplodef == "variant"
elsif $options.algorithm == "annot2baits"
cmdline << " -U "
for feature in $options.features
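
In `get_command_line`, `$options.maxvars` is an Integer, so it needs `.to_s` before being concatenated onto the command string. The following small sketch shows the same string building on its own. `variant_flags` is a hypothetical helper, not a BaitsTools method.

```ruby
# Hypothetical helper: rebuild the haplotype-related flags for the summary
# command line. --maxvars is only reported for the `variant` definition.
def variant_flags(haplodef, maxvars)
  flags = +" -H " << haplodef # unary plus gives a mutable string
  flags << " --maxvars " << maxvars.to_s if haplodef == "variant"
  flags
end

puts variant_flags("variant", 24)
puts variant_flags("haplotype", 24)
```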
4 changes: 2 additions & 2 deletions osx_install.sh
@@ -1,6 +1,6 @@
#!/bin/bash
#-----------------------------------------------------------------------------------------------
# osx_install v 1.7.8
# osx_install v 1.8.0
# Michael G. Campana, 2017-2023
# Smithsonian's National Zoo and Conservation Biology Institute
#-----------------------------------------------------------------------------------------------
@@ -10,4 +10,4 @@ source ~/.rvm/scripts/rvm
rvm install 3.1.2
rvm --default use 3.1.2
gem build baitstools.gemspec
gem install ./baitstools-1.7.8.gem
gem install ./baitstools-1.8.0.gem
