
Releases: GilesStrong/lumin

v0.9.1

29 Oct 08:05

v0.9.0

29 Oct 07:38

Major updates to dependencies and a move to Poetry for build and packaging

What's Changed

Full Changelog: v0.8.1...v0.9.0

v0.8.1

29 Oct 06:45

N.B. This is the last version to use setuptools for building; subsequent versions use Poetry

What's Changed

New Contributors

Full Changelog: v0.8.0...v0.8.1

v0.8.0 - Mistake not...

07 Jul 15:47

v0.8.0 - Mistake not...

Important changes

  • GNN architectures generalised into feature extraction and graph collapse stages, see details below and updated tutorial

Breaking

Additions

  • GravNet GNN head and GravNetLayer sub-block (Qasim, Kieseler, Iiyama, & Pierini, 2019)
    • Includes optional self-attention
  • SelfAttention and OffsetSelfAttention
  • Batchnorm:
    • LCBatchNorm1d to run batchnorm over length x channel data
    • Additional bn_class arguments to blocks, allowing the user to choose different batchnorm implementations
    • 1, 2, & 3D Running batchnorm layers from fastai (https://github.com/fastai/course-v3)
  • GNNHead encapsulating head for feature extraction, using AbsGraphFeatExtractor classes, and graph collapsing, using GraphCollapser classes
  • New callbacks:
    • AbsWeightData to weight folds of data based on their inputs or targets
    • EpochSaver to save the model to a new file at the end of every epoch
    • CycleStep combines OneCycle and step-decay of optimiser hyper-parameters
  • New CNN blocks:
    • AdaptiveAvgMaxConcatPool1d, AdaptiveAvgMaxConcatPool2d, AdaptiveAvgMaxConcatPool3d use average and maximum pooling to reduce data to a specified size per channel
    • SEBlock1d, SEBlock2d, SEBlock3d apply squeeze-excitation to data channels (see the sketch after this list)
  • BackwardHook for recording telemetric data during backwards passes
  • New losses:
    • WeightedFractionalMSE, WeightedBinnedHuber, WeightedFractionalBinnedHuber
  • Options for log x & y axis in plot_feat
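As an illustration of the squeeze-excitation idea behind the new SEBlock*d blocks referenced above, here is a minimal PyTorch sketch for (batch, channels, length) data; the class name, reduction argument, and defaults are illustrative, not LUMIN's exact API.

```python
import torch
import torch.nn as nn

class ToySEBlock1d(nn.Module):
    '''Minimal squeeze-excitation sketch for (batch, channels, length) data.
    Illustrative only: LUMIN's SEBlock1d may differ in naming and defaults.'''

    def __init__(self, n_channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool1d(1)  # global average pool per channel
        self.excite = nn.Sequential(
            nn.Linear(n_channels, n_channels // reduction), nn.ReLU(),
            nn.Linear(n_channels // reduction, n_channels), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.squeeze(x).squeeze(-1)   # (batch, channels)
        w = self.excite(w).unsqueeze(-1)  # per-channel weights in [0, 1]
        return x * w                      # reweight channels

x = torch.randn(8, 32, 100)
print(ToySEBlock1d(32)(x).shape)  # torch.Size([8, 32, 100])
```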

Removals

  • Scheduled removal of deprecated methods and functions from old model and callback system:
    • OldAbsCallback
    • OldCallback
    • OldAbsCyclicCallback
    • OldCycleLR
    • OldCycleMom
    • OldOneCycle
    • OldBinaryLabelSmooth
    • SequentialReweight
    • SequentialReweightClasses
    • OldBootstrapResample
    • OldParametrisedPrediction
    • OldGradClip
    • OldLsuvInit
    • OldAbsModelCallback
    • OldSWA
    • OldLRFinder
    • OldEnsemble
    • OldAMS
    • OldMultiAMS
    • OldBinaryAccuracy
    • OldRocAucScore
    • OldEvalMetric
    • OldRegPull
    • OldRegAsProxyPull
    • OldAbsModel
    • OldModel
    • fold_train_ensemble
    • OldMetricLogger
    • fold_lr_find
    • old_plot_train_history
    • _get_folds
  • Unnecessary pred_cb argument in train_models

Fixes

  • Bug when trying to use batchnorm in InteractionNet
  • Bug in FoldFile.save_fold_pred when predictions change shape and try to overwrite existing predictions

Changes

  • padding argument in conv 1D blocks renamed to pad
  • Graph nets: generalised into feature extraction for features per vertex and graph collapsing down to flat data (with optional self-attention)
  • Renamed FowardHook to ForwardHook
  • Abstract classes no longer inherit from ABC, but rather have metaclass=ABCMeta in order to be compatible with py>=3.7
  • Updated the example of binary classification of signal & background to use the model and training resulting from https://iopscience.iop.org/article/10.1088/2632-2153/ab983a
    • Also changed the multi-target regression example to use non-densely connected layers, and the multi-target classification example to use a cosine annealed cyclical LR
  • Updated the single-target regression example to use WeightedBinnedHuber as a loss
  • Changed from torch.tensor import Tensor to from torch import Tensor for compatibility with latest PyTorch
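For reference, the compatibility change is just the import path:

```python
# Old import, which breaks on recent PyTorch versions:
# from torch.tensor import Tensor
# New import, compatible with current PyTorch:
from torch import Tensor
```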

Deprecations

  • OldInteractionNet deprecated in favour of the InteractionNet feature extractor. Will be removed in v0.9

v0.7.2 - All your batch are belong to us - Micro Update

11 Mar 13:50

v0.7.2 - All your batch are belong to us - Micro Update

Important changes

  • Fixed bug in Model.set_mom which resulted in momentum never being set (affects e.g. OneCycle and CyclicalMom)
  • Model.fit now shuffles the fold indices for training folds prior to each epoch rather than once per training; removes the periodicity in training loss which was occasionally apparent.
  • Bugs found in OneCycle:
    • When training multiple models, the initial LR for subsequent models was the end LR of the previous model (list in partial was being mutated)
    • The model did not stop training at end of cycle
    • Momentum was never altered in the optimiser

Breaking

Additions

  • Mish activation function (see the sketch after this list)
  • Model.fit_params.val_requires_grad to control whether the validation epoch is computed with gradient tracking; defaults to False, but some losses might require it in the future
  • ParameterisedPrediction now stores copies of values for parametrised features in case they change, or need to be changed locally during prediction.
  • freeze_layers and unfreeze_layers methods for Model
  • PivotTraining callback implementing Learning to Pivot (Louppe, Kagan, & Cranmer, 2016)
    • New example reimplementing paper's jets example
  • TargReplace callback for replacing target data in BatchYielder during training
  • Support for loss functions being fastcore partialler objects
  • train_models now has arguments to:
    • Exclude specific fold indices from training and validation
    • Train models on unique folds, e.g. when training 5 models on a file with 10 folds, each model would be trained on its own unique pair of folds
  • Added discussion of core concepts in LUMIN to the docs
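As referenced above, a minimal plain-PyTorch sketch of the Mish activation, x * tanh(softplus(x)); LUMIN's own implementation may differ in details such as memory optimisations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    '''Mish activation: x * tanh(softplus(x)). Plain-PyTorch sketch only.'''

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.tanh(F.softplus(x))

print(Mish()(torch.linspace(-3, 3, 7)))
```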

Removals

Fixes

  • Cases in which a NaN in the metric during training could spoil plotting and SaveBest
  • Bug in Model.set_mom which resulted in momentum never being set (affects e.g. OneCycle and CyclicalMom)
  • Bug in MetricLogger.get_results where tracking metrics could be spoilt by NaN values
  • Bug in train when not passing any metrics
  • Bug in FoldYielder when loading output pipe from Path
  • Bugs found in OneCycle:
    • When training multiple models, the initial LR for subsequent models was the end LR of the previous model (list in partial was being mutated)
    • The model did not stop training at end of cycle
    • Momentum was never altered in the optimiser

Changes

  • Model.fit now shuffles the fold indices for training folds prior to each epoch rather than once per training; removes the periodicity in training loss which was occasionally apparent.
  • Validation and prediction forward passes are now performed without gradient tracking to save memory and time (see the sketch after this list)
  • MetricLogger now records loss values on batch end rather than on forwards end
  • on_batch_end now always called regardless of model state
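The gradient-free validation and prediction change amounts to wrapping those forward passes roughly like this (a generic PyTorch sketch, not LUMIN's internal code):

```python
import torch

def validation_loss(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
                    loss_fn=torch.nn.functional.mse_loss) -> float:
    '''Compute a loss without building a graph, saving memory and time.'''
    model.eval()
    with torch.no_grad():  # no gradient tracking for validation/prediction
        return loss_fn(model(x), y).item()
```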

Deprecations

Comments

v0.7.1 - All your batch are belong to us - Micro Update

15 Dec 13:44

v0.7.1

Important changes

  • EvalMetrics revised to inherit from Callback and be called on validation data after every epoch. User-written EvalMetrics will need to be adjusted to work with the new calling method: the evaluate method and constructor may need to be adjusted; see the existing metrics for examples.

Breaking

  • eval_metrics argument in train_models renamed to metric_partials and now takes a list of partial EvalMetrics
  • User-written EvalMetrics will need to be adjusted to work with the new calling method: the evaluate method and constructor may need to be adjusted; see the existing metrics for examples and the hedged sketch below.
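A purely hypothetical sketch of what an adjusted user-written metric might look like under the new callback-based calling method; the import path, constructor arguments, and the attributes read inside evaluate are all assumptions, so mirror one of the existing metrics (e.g. BinaryAccuracy) for the real signatures.

```python
import numpy as np
from lumin.nn.metrics.eval_metric import EvalMetric  # assumed import path

class ExampleAccuracy(EvalMetric):
    '''Hypothetical sketch: constructor arguments and the attributes used in
    evaluate are assumptions about the new API, not confirmed signatures.'''

    def __init__(self, name: str = 'Accuracy'):
        super().__init__(name=name)  # assumed signature

    def evaluate(self) -> float:
        # Assumed: validation targets and predictions are exposed to the metric
        # by the callback machinery after each epoch
        preds, targs = np.ravel(self.preds), np.ravel(self.targets)
        return float(np.mean((preds > 0.5).astype(int) == targs))
```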

Additions

  • OneCycle now has a cycle_ends_training argument which, when set to False, allows training to continue at the final LR and momentum. Keeping the default of True ends training once the cycle is complete, as usual.
  • to_np now returns None when input tensor is None
  • plot_train_history now plots metric evolution for validation data

Removals

Fixes

  • Model now creates cb_savepath if it didn't already exist
  • Bug in PredHandler where predictions were kept on device leading to increased memory usage
  • Version issue in matplotlib affecting plot positioning

Changes

Deprecations

  • V0.8:
    • All EvalMetrics deprecated with the new metric system. They have been copied and renamed to Old* for compatibility with the old model training system.
    • OldEvalMetric: Replaced by EvalMetric
    • OldMultiAMS: Replaced by MultiAMS
    • OldAMS: Replaced by AMS
    • OldRegPull: Replaced by RegPull
    • OldRegAsProxyPull: Replaced by RegAsProxyPull
    • OldRocAucScore: Replaced by RocAucScore
    • OldBinaryAccuracy: Replaced by BinaryAccuracy

Comments

v0.7.0 - All your batch are belong to us

12 Nov 16:57

v0.7.0 - All your batch are belong to us

Important changes

  • Model training and callbacks have significantly changed:
    • Model.fit now expects to perform the entire training procedure, rather than just single epochs.
    • A lot of the functionality of the old training method fold_train_ensemble is now delegated to Model.fit.
    • A new ensemble training method train_models has replaced fold_train_ensemble. It provides a similar API, but aims to be more understandable to users.
    • Model.fit is now 'stateful': a fit_params class is created containing all the information and data relevant to training the model, and training methods change their actions according to fit_params.state ('train', 'valid', and 'test')
    • Callbacks now have greater potential: They have more action points during the training cycle, where they can affect training behaviour, and they have access to fit_params, allowing them to modify more aspects of the training and have indirect access to all other callbacks.
    • The "tick" for the training loop is now one epoch, i.e. validation loss is computed after the entire use of the training data (as opposed to after every sub-epoch), and cyclic callbacks now work on the scale of epochs rather than sub-epochs. Due to the data being split into folds, the concept of a sub-epoch still exists, but the APIs are now simplified for the user (previously they were a mixture of sub-epoch and epoch arguments).
    • For users who do not wish to transition to the new model behaviour, the existing behaviour can still be achieved by using the Old* models and classes. See the deprecations section for the full list.
  • Input masks (present if e.g. using feature subsampling in ModelBuilder)
    • BatchYielder now takes an input_mask argument to filter inputs
    • Model prediction methods no longer take input mask arguments, instead the input mask (if present) is automatically used. If users have already filtered their data, they should manually remove the input mask from the model (i.e. set it to None)
  • Callbacks which take arguments related to (sub-)epochs (e.g. cycle length, scale, time to renewal, etc. for CycleLR, OneCycle, etc., and SWA) now take these arguments in terms of epochs. I.e. a OneCycle schedule with 9 training folds, running for 15 epochs, would previously require e.g. lengths=(45,90) in order to complete the cycle in 15 epochs (135 sub-epochs). Now it is specified as simply lengths=(5,10); see the sketch after this list. Additionally, these arguments must be integers; floats will be coerced to integers with a warning.
  • lr_find now runs over all training folds, instead of just 1
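To make the epoch-based rescaling concrete, here is a hedged sketch of the OneCycle change described above; the lengths keyword and the partial-based construction follow the usage mentioned in these notes, but the import path is an assumption, so check the current signature before relying on it.

```python
from functools import partial
from lumin.nn.callbacks.cyclic_callbacks import OneCycle  # assumed import path

# 9 training folds x 15 epochs = 135 sub-epochs in total.
# Previously the cycle was specified in sub-epochs:
#   cycle = partial(OneCycle, lengths=(45, 90))  # 45 + 90 = 135 sub-epochs
# Now it is specified directly in (integer) epochs:
cycle = partial(OneCycle, lengths=(5, 10))       # 5 + 10 = 15 epochs
```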

Breaking

  • Heavy renaming of methods and classes due to changes in model training and callbacks.

Additions

  • __del__ method to FowardHook class
  • BatchYielder:
    • Now takes an input_mask argument to filter inputs
    • Now takes an argument allowing incomplete batches to be yielded
    • Target array can now be None
  • Model:
    • now takes a bs argument for evaluate
    • predictions can now be modified by passing a PredHandler callback to pred_cb. The default one simply returns the model predictions, however other actions could be defined by the user, e.g. performing argmax for multiclass classifiers (sketched below).
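As an illustration of the kind of post-processing a user-defined PredHandler might perform, here is a plain-PyTorch argmax helper; it is not LUMIN's API, just the operation such a callback could apply to the gathered predictions.

```python
import torch

def argmax_predictions(preds: torch.Tensor) -> torch.Tensor:
    '''Convert per-class outputs of a multiclass classifier to class indices.'''
    return preds.argmax(dim=-1)

probs = torch.tensor([[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]])
print(argmax_predictions(probs))  # tensor([1, 0])
```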

Removals

  • Model:
    • Now no longer takes callbacks and mask_inputs as arguments for evaluate
    • evaluate_from_by removed, just call evaluate
  • Callbacks no longer take model and plot_settings arguments during initialisation. These should be added by calling the relevant setters. Model will call them when relevant.

Fixes

  • Potential bug in convolutional models where checking the out size of the head would affect the batchnorm averaging
  • Potential bug in plot_sample_pred to do with bin ranges
  • ForwardHook not working with passed hook functions

Changes

  • BinaryLabelSmooth now only applies smoothing during training and not in validation
  • Ensemble
    • from_results and build_ensemble now no longer take location as an argument. Instead, results should contain the savepath for the models
    • _build_ensemble is now private
  • Model:
    • predict_array and predict_folds are now private
    • fit now expects to perform the entire fitting of the model, rather than just one sub-epoch. Additionally, validation loss is now computed only at the end of the epoch, rather than after each fold as previously.
  • SWA renewal_period should now be None in order to prevent a second average being tracked (previously was negative)
  • Some examples have been renamed, and copies using the old model fitting procedure and old callbacks are available in examples/old
  • lr_find now runs over all training folds, instead of just 1

Deprecations

  • V0.8:
    • Many classes and methods deprecated with the new model system. They have been copied and renamed to Old*.
    • OldAbsModel: Replaced by AbsModel
    • OldModel: Replaced by Model
    • OldAbsCallback: Replaced by AbsCallback
    • OldCallback: Replaced by Callback
    • OldBinaryLabelSmooth: Replaced by BinaryLabelSmooth
    • OldSequentialReweight: Will not be replaced
    • SequentialReweightClasses: Will not be replaced
    • OldBootstrapResample: Replaced by BootstrapResample
    • OldParametrisedPrediction: Replaced by ParametrisedPrediction
    • OldGradClip: Replaced by GradClip
    • OldLsuvInit: Replaced by LsuvInit
    • OldAbsCyclicCallback: Replaced by AbsCyclicCallback
    • OldCycleLR: Replaced by CycleLR
    • OldCycleMom: Replaced by CycleMom
    • OldOneCycle: Replaced by OneCycle
    • OldLRFinder: Replaced by LRFinder
    • fold_lr_find: Replaced by lr_find
    • fold_train_ensemble: Replaced by train_models
    • OldMetricLogger: Replaced by MetricLogger
    • AbsModelCallback: Will not be replaced
    • OldSWA: Replaced by SWA
    • old_plot_train_history: Replaced by plot_train_history
    • OldEnsemble: Replaced by Ensemble

Comments

v0.6.0 - Train and Converge Until it is Done

09 Sep 14:22

v0.6.0 - Train and Converge Until it is Done

Important changes

  • auto_filter_on_linear_correlation now examines all features within correlated clusters, rather than just the most correlated pair. This means that the function now only needs to be run once, rather than multiple reruns as previously recommended.
  • Moved to Scikit-learn 0.22.2, and moved, where possible, to keyword argument calls for sklearn methods in preparation for 0.25 enforcement of keyword arguments
  • Fixed error in patience when using cyclical LR callbacks: the patience now specifies the number of cycles to go without improvement (previously 1 + the number had to be specified).
  • Matrix data is no longer passed through np.nan_to_num in FoldYielder. Users should ensure that all values in matrix data are not NaN or Inf
  • Tensor data:
    • df2foldfile, fold2foldfile, and add_meta_data can now support the saving of arbitrary matrices as a matrix input
    • Pass a numpy.array whose first dimension matches the length of the DataFrame to the tensor_data argument of df2foldfile and a name to tensor_name.
      The array will be split along the first dimension and the sub-arrays will be saved as matrix inputs in the resulting foldfile
    • The matrices may also be passed as sparse format and be densified on loading by FoldYielder (see the sketch after this list)
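A hedged sketch of the tensor-data workflow described above: the tensor_data and tensor_name arguments are named in these notes, but the import path and the remaining arguments shown are assumptions, so check the df2foldfile signature before use.

```python
import numpy as np
import pandas as pd
from lumin.data_processing.file_proc import df2foldfile  # assumed import path

n = 1000
df = pd.DataFrame({'feat_a': np.random.rand(n), 'gen_target': np.random.randint(0, 2, n)})
matrices = np.random.rand(n, 4, 5)  # first dimension matches len(df)

# Illustrative call: argument names other than tensor_data/tensor_name are assumed
df2foldfile(df, n_folds=10, cont_feats=['feat_a'], cat_feats=[],
            targ_feats='gen_target', savename='example_data',
            tensor_data=matrices, tensor_name='my_matrix')
```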

Breaking

  • plot_rank_order_dendrogram now returns sets of all features in cluster with distance over the threshold, rather than just the closest features in each cluster

Additions

  • Addition of batch size parameter to Ensemble.predict*
  • Lorentz Boost Network (https://arxiv.org/abs/1812.09722):
    • LorentzBoostNet basic implementation which learns boosted particles from existing particles and extracts features from them using fixed kernel functions
    • AutoExtractLorentzBoostNet which also learns the kernel-functions during training
  • Classification Eval classes:
    • BinaryAccuracy: Computes and returns the accuracy of a single-output model for binary classification tasks.
    • RocAucScore: Computes and returns the area under the Receiver Operator Characteristic curve (ROC AUC) of a classifier model.
  • plot_binary_sample_feat: a version of plot_sample_pred designed for plotting feature histograms with stacked contributions by sample for background.
  • Added compression arguments to df2foldfile, fold2foldfile, and save_to_grp
  • Tensor data:
    • df2foldfile, fold2foldfile, and add_meta_data can now support the saving of arbitrary matrices as a matrix input
    • Pass a numpy.array whose first dimension matches the length of the DataFrame to the tensor_data argument of df2foldfile and a name to tensor_name.
      The array will be split along the first dimension and the sub-arrays will be saved as matrix inputs in the resulting foldfile
    • The matrices may also be passed as sparse format and be densified on loading by FoldYielder
  • plot_lr_finders now has a log_y argument for logarithmic y-axis. By default, log_y is set automatically if the maximum fractional difference between losses is greater than 50
  • Added new rescaling options to ClassRegMulti using linear outputs and scaling by mean and std of targets
  • LsuvInit now applies scaling to nn.Conv3d layers
  • plot_lr_finders and fold_lr_find now have options to save the resulting LR finder plot (currently limited to png due to problems with pdf)
  • Addition of the AdamW optimiser, thanks to @kiryteo
  • Contribution guide, thanks to @kiryteo
  • OneCycle lr_range now supports a non-zero final LR; just supply a three-tuple to the lr_range argument.
  • Ensemble.from_models classmethod for combining in-memory models into an Ensemble.

Removals

  • FeatureSubsample
  • plots keyword in fold_train_ensemble

Fixes

  • Docs bug for nn.training due to missing ipython in requirements
  • Bug in LSUV init when running on CUDA
  • Bug in TF export based on searching for fullstops
  • Bug in model_bar update during fold training
  • Quiet bug in MultiHead when matrix feats were not listed first: map construction indexed self.matrix_feats rather than self.feats
  • Slowdown in ensemble.predict_array which caused the array to be sent to device during each model's evaluation
  • Model.get_param_count now includes non-trainable params when requested
  • Fixed bug in fold_lr_find where LR finders would use different LR steps, leading to NaNs when plotting
  • plot_feat used to coerce NaNs and Infs via np.nan_to_num prior to plotting, potentially impacting distributions, plotting scales, moments, etc. Fixed so that nan and inf values are removed rather than coerced.
  • Fixed early-stopping statement in fold_train_ensemble to state the number as "sub-epochs" (previously said "epochs")
  • Fixed error in patience when using cyclical LR callbacks: the patience now specifies the number of cycles to go without improvement (previously 1 + the number had to be specified).
  • Unnecessary warning in df2foldfile when no strat_key is passed.
  • Saved matrices in fold2foldfile are now in float32
  • Fixed return type of get_layers methods in RNNs_CNNs_and_GNNs_for_matrix_data example
  • Bug in model.predict_array when predicting matrix data with a batch size
  • Added missing indexing in AbsMatrixHead to use torch.bool if PyTorch version is >= 1.2 (was uint8, which is now deprecated for indexing)
  • Errors when running in terminal due to trying to call .show on fastprogress bars
  • Bug due to encoding of readme when trying to install when default encoder is ascii
  • Bug when running Model.predict in batches when the data contains less than one batch
  • Include missing files in sdist, thanks to @thatch
  • Test path correction in example notebook, thanks to @kiryteo
  • Doc links in hep_proc
  • Error in MultiHead._set_feats when matrix_head does not contain 'vecs' or 'feats_per_vec' keywords
  • Compatibility error in numpy >= 1.18 in bin_binary_class_pred due to float instead of int
  • Unnecessary second loading of fold data in fold_lr_find
  • Compatibility error when working in PyTorch 1.6 based on integer and true division
  • SWA not evaluating in batches when running in non-bulk-move mode
  • Moved from normed to density keywords for matplotlib

Changes

  • ParametrisedPrediction now accepts lists of parameterisation features
  • plot_sample_pred now ensures that signal and background have the same binning
  • PlotSettings now coerces string arguments for savepath to Path
  • Added default value for targ_name in EvalMetric
  • plot_rank_order_dendrogram:
    • Now uses "optimal ordering" for improved presentation
    • Now returns sets of all features in cluster with distance over the threshold, rather than just the closest features in each cluster
  • auto_filter_on_linear_correlation now examines all features within correlated clusters, rather than just the most correlated pair. This means that the function now only needs to be run once, rather than multiple reruns as previously recommended.
  • Improved data shuffling in BatchYielder, now runs much quicker
  • Slight speedup when loading data from foldfiles
  • Matrix data is no longer passed through np.nan_to_num in FoldYielder. Users should ensure that all values in matrix data are not NaN or Inf
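Since matrix data is no longer sanitised automatically, a simple pre-save check along these lines (plain NumPy, illustrative) can catch problem values:

```python
import numpy as np

def assert_finite(arr: np.ndarray, name: str = 'matrix data') -> None:
    '''Raise if NaN or Inf values are present, since FoldYielder no longer
    passes matrix data through np.nan_to_num.'''
    if not np.isfinite(arr).all():
        raise ValueError(f'{name} contains NaN or Inf values; clean it before saving')
```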

Deprecations

Comments

  • RFPImp still imports from sklearn.ensemble.forest, which is deprecated and possibly part of the private API. Hopefully the package will remedy this before removal. For now, future warnings are displayed.

v0.5.1 - The Gradient Must Flow - Micro Update

12 Feb 13:17

v0.5.1 - The Gradient Must Flow - Micro Update

Important changes

  • New live plot for losses during training (MetricLogger):
    • Provides additional information
    • Only updates after every epoch (previously every sub-epoch), reducing training times
    • Nicer appearance and automatic log scale for y-axis

Breaking

Additions

  • New live plot for losses during training (MetricLogger):
    • Provides additional information
    • Only updates after every epoch (previously every sub-epoch), reducing training times
    • Nicer appearance and automatic log scale for y-axis

Removals

Fixes

  • Fixed error in documentation which removed the ToC for the nn module

Changes

Deprecations

  • plots argument in fold_train_ensemble. The plots argument is now deprecated and ignored. Loss history will always be shown, lr history will no longer be shown separately, and live feedback is now controlled by the four live_fdbk arguments. This argument will be removed in V0.6.

Comments

v0.5 The Gradient Must Flow

10 Feb 11:32

v0.5 The Gradient Must Flow

Important changes

  • Added support for processing and embedding of matrix data
    • MultiHead to allow the use of multiple head blocks to handle input data containing flat and matrix inputs
    • AbsMatrixHead abstract class for head blocks designed to process matrix data
    • InteractionNet a new head block to apply interaction graph-nets to objects in matrix form
    • RecurrentHead a new head block to apply recurrent layers (RNN, LSTM, GRU) to series objects in matrix form
    • AbsConv1dHead a new abstract class for building convolutional networks from basic blocks to apply to objects in matrix form.
  • Meta data:
    • FoldYielder now checks its foldfile for a meta_data group which contains information about the features and inputs in the data
    • cont_feats and cat_feats now no longer need to be passed to FoldYielder during initialisation if the foldfile contains meta data
    • add_meta_data function added to write meta data to foldfiles and is automatically called by df2foldfile
  • Improved usage with large datasets:
    • Added Model.evaluate_from_by to allow batch-wise evaluation of loss
    • bulk_move in fold_train_ensemble now also affects the validation fold, i.e. bulk_move=False no longer preloads the validation fold, and validation loss is evaluated using Model.evaluate_from_by
    • bulk_move arguments added to fold_lr_find
    • Added batch-size argument to Model predict methods to run predictions in batches
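In generic PyTorch terms, the batch-wise evaluation and prediction added here boils down to chunking the data before the forward pass; a concept sketch, not LUMIN's internal code:

```python
import torch

def predict_in_batches(model: torch.nn.Module, x: torch.Tensor, bs: int = 256) -> torch.Tensor:
    '''Run predictions in chunks of bs rows to bound memory usage, mirroring
    the batch-size argument added to the Model predict methods.'''
    model.eval()
    preds = []
    with torch.no_grad():
        for i in range(0, len(x), bs):
            preds.append(model(x[i:i + bs]))
    return torch.cat(preds)
```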

Potentially Breaking

  • FoldYielder.get_df() now returns any NaNs present in data rather than zeros unless nan_to_num is set to True
  • Zero bias init for bottlenecks in MultiBlock body

Additions

  • __repr__ of Model now details information about input variables
  • Added support for processing and embedding of matrix data
    • MultiHead to allow the use of multiple head blocks to handle input data containing flat and matrix inputs
    • AbsMatrixHead abstract class for head blocks designed to process matrix data
    • InteractionNet a new head block to apply interaction graph-nets to objects in matrix form
    • RecurrentHead a new head block to apply recurrent layers (RNN, LSTM, GRU) to series objects in matrix form
    • AbsConv1dHead a new abstract class for building convolutional networks from basic blocks to apply to objects in matrix form.
  • Meta data:
    • FoldYielder now checks its foldfile for a meta_data group which contains information about the features and inputs in the data
    • cont_feats and cat_feats now no longer need to be passed to FoldYielder during initialisation if the foldfile contains meta data
    • add_meta_data function added to write meta data to foldfiles and is automatically called by df2foldfile
  • get_inputs method to BatchYielder to return the inputs, optionally on device
  • Added LSUV initialisation, implemented by LsuvInit callback

Removals

Fixes

  • FoldYielder.get_df() now returns any NaNs present in data rather than zeros unless nan_to_num is set to True
  • Various typing fixes
  • Body and tail modules not correctly freezing
  • Made Swish not be inplace; it seemed to cause problems sometimes
  • Enforced fastprogress version; latest version renamed a parameter
  • Added support to df2foldfile for missing strat_key
  • Added support to fold2foldfile for missing features
  • Zero bias init for bottlenecks in MultiBlock body

Changes

  • Slight optimisation in FullyConnected when not using dense or residual networks
  • FoldYielder.set_foldfile is now a private function FoldYielder._set_foldfile
  • Improved usage with large datasets:
    • Added Model.evaluate_from_by to allow batch-wise evaluation of loss
    • bulk_move in fold_train_ensemble now also affects the validation fold, i.e. bulk_move=False no longer preloads the validation fold, and validation loss is evaluated using Model.evaluate_from_by
    • bulk_move arguments added to fold_lr_find
    • Added batch-size argument to Model predict methods to run predictions in batches

Deprecations

Comments