v2.0.0
Switched from TensorFlow to PyTorch.
Existing models for recent basecallers have been converted to the new format.
PyTorch format models contain a _pt suffix in the filename.
Changed
- Inference is now performed using PyTorch instead of TensorFlow.
- The medaka consensus command has been renamed to medaka inference to reflect its function in running an arbitrary model and avoid confusion with medaka_consensus.
- The medaka stitch command has been renamed to medaka sequence to reflect its function in creating a consensus sequence.
- The medaka variant command has been renamed to medaka vcf to reflect its function in consolidating variants and avoid confusion with medaka_variant.
- Order of arguments to medaka vcf has been changed to be more consistent with medaka sequence.
- The helper script medaka_haploid_variant has been renamed medaka_variant to save typing.
- Make --ignore_read_groups option available to more medaka subcommands including inference.
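For scripts that still call the old names, the renames above could be applied mechanically. The following is a minimal sketch, not part of medaka: the helper `migrate_command` and its two lookup tables are hypothetical, and it assumes the subcommand is the first argument after `medaka`. It only rewrites names; it does not reorder the arguments of medaka vcf, so those calls still need manual review.

```python
# Hypothetical migration helper (not shipped with medaka).
# Maps pre-2.0.0 subcommand and script names to their v2.0.0 names.
SUBCOMMAND_RENAMES = {
    "consensus": "inference",
    "stitch": "sequence",
    "variant": "vcf",
}
SCRIPT_RENAMES = {
    "medaka_haploid_variant": "medaka_variant",
}


def migrate_command(argv):
    """Return a copy of argv with old medaka names replaced by new ones.

    Assumes argv[0] is the program name and, for `medaka`, that argv[1]
    is the subcommand. All other arguments are passed through unchanged.
    """
    if not argv:
        return []
    program, rest = argv[0], list(argv[1:])
    if program in SCRIPT_RENAMES:
        return [SCRIPT_RENAMES[program]] + rest
    if program == "medaka" and rest:
        rest[0] = SUBCOMMAND_RENAMES.get(rest[0], rest[0])
    return [program] + rest
```

For example, `migrate_command(["medaka", "stitch", "ARGS"])` gives `["medaka", "sequence", "ARGS"]`, where `ARGS` stands in for whatever arguments the original call used.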
Removed
- The medaka snp command has been removed. This was long defunct as diploid SNP calling had been deprecated, and medaka variant is used to create VCFs for current models.
- Loading models in hdf format has been deprecated.
- Deleted minimap2 and racon wrappers in medaka/wrapper.py.
Added
- Release conda packages for Linux (x86 and aarch64) and macOS (arm64).
- Option --lr_schedule allows using cosine learning rate schedule in training.
- Option --max_valid_samples to set number of samples in a training validation batch.
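As background on the cosine schedule mentioned above, the sketch below shows the general cosine-annealing formula in plain Python. It is an illustration only: the function name `cosine_lr`, its parameters, and the absence of warm-up or restarts are assumptions, and the details of the schedule medaka selects with --lr_schedule may differ.

```python
import math


def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine-annealed learning rate at `step` out of `total_steps`.

    Starts at lr_max, follows half a cosine wave, and ends at lr_min.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so the rate stays at lr_min after the end.
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
```

The rate falls slowly at first, fastest around the midpoint, and flattens out again near the end of training. In PyTorch itself a schedule of this shape is available as `torch.optim.lr_scheduler.CosineAnnealingLR`.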
Fixed
- Training models with DiploidLabelScheme uses categorical cross-entropy loss
instead of binary cross-entropy.
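To illustrate the difference between the two losses named in this fix, here is a generic sketch in plain Python. It does not reproduce medaka's DiploidLabelScheme encoding or its training code; the function names and the one-hot target are assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(logits, target_index):
    """Negative log of the softmax probability of the true class.

    Classes compete: probabilities sum to one across all classes.
    """
    return -math.log(softmax(logits)[target_index])


def binary_cross_entropy(logits, target_index):
    """Mean per-class sigmoid loss against a one-hot target.

    Each class is scored independently as its own yes/no decision.
    """
    total = 0.0
    for i, x in enumerate(logits):
        p = 1.0 / (1.0 + math.exp(-x))
        y = 1.0 if i == target_index else 0.0
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(logits)
```

Categorical cross-entropy suits a label scheme in which each position carries exactly one class out of a fixed set, because the softmax makes the classes mutually exclusive.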