Releases: cornellius-gp/gpytorch
GPyTorch 1.0.0 -- Stable release, major new features
Major New Features and Improvements
Each feature in this section comes with a new example notebook and documentation showing how to use it -- check the new docs!
- Added support for deep Gaussian processes (#564).
- KeOps integration has been added -- replace certain `gpytorch.kernels.SomeKernel` modules with `gpytorch.kernels.keops.SomeKernel` with KeOps installed, and run exact GPs on 100,000+ data points (#812). See the sketch after this list.
- Variational inference has undergone significant internal refactoring! All old variational objects should still function, but many are deprecated (#903).
- Our integration with Pyro has been completely overhauled and is now much improved. For examples of interesting GP + Pyro models, see our new examples (#903).
- Our example notebooks have been completely reorganized, and our documentation surrounding them has been rewritten to hopefully provide a better tutorial to GPyTorch (#954).
- Added support for fully Bayesian GP modelling via NUTS (#918).
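
As a rough illustration of the KeOps swap, here is a minimal sketch of an exact GP that uses `gpytorch.kernels.keops.MaternKernel` in place of the standard Matern kernel. It assumes the `pykeops` package is installed; the class and variable names are purely illustrative.

```python
import gpytorch

# A minimal sketch: the only change from a standard exact GP is swapping the
# kernel class for its KeOps counterpart (requires pykeops to be installed).
class LargeScaleGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # Standard kernel would be: gpytorch.kernels.MaternKernel(nu=2.5)
        # KeOps drop-in replacement for 100,000+ data points:
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.keops.MaternKernel(nu=2.5)
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
```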
Minor New Features and Improvements
- `GridKernel` and `GridInterpolationKernel` now support rectangular grids (#888).
- Added cylindrical kernel (#577).
- Added polynomial kernel (#668).
- Added tutorials on basic usage (hyperparameters, saving/loading, etc) (#685).
- `get_fantasy_model` now supports batched models (#693).
- Added a `prior_mode` context manager that causes GP models to evaluate in prior mode (#707) -- see the sketch after this list.
- Added linear mean (#676).
- Added horseshoe prior (#719).
- Added polynomial kernel with derivatives (#783).
- Fantasy model computations now use QR for solving least squares problems, improving numerical stability (#790).
- All legacy functions have been removed, in favor of new function format in PyTorch (#799).
- Added Newton Girard kernel (#821).
- GP predictions now automatically clear caches when backpropagating through them. Previously, if you wanted to train through a GP in eval mode, you had to clear the caches manually by toggling the GP back to train mode and then to eval mode again. This is no longer necessary (#916).
- Added rational quadratic kernel (#330).
- Switched to using `torch.cholesky_solve` and `torch.logdet` now that they support batch mode / backwards (#880).
- Better / less redundant parameterization for correlation matrices, e.g. in `IndexKernel` (#912).
- Kernels now define `__getitem__`, which allows slicing batch dimensions (#782).
- Performance improvements in the small data regime, e.g. n < 2000 (#926).
- Increased the size of kernel matrix for which Cholesky is the default solve strategy to n=800 (#946).
- Added an option for manually specifying a different preconditioner for `AddedDiagLazyTensor` (#930).
- Added pre-commit hooks that enforce code style (#927).
- Lengthscales have been refactored, and kernels have an `is_stationary` attribute (#925).
- All of our example notebooks now get smoke tested by our CI.
- Added a `deterministic_probes` setting that causes our MLL computation to be fully deterministic when using CG+Lanczos, which improves L-BFGS convergence (#929).
- The use of the Woodbury formula for preconditioner computations is now fully replaced by QR, which improves numerical stability (#968).
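
As a quick illustration of the `prior_mode` context manager mentioned above, here is a minimal sketch; `model`, `likelihood`, and `test_x` are assumed to come from a standard exact-GP setup.

```python
import torch
import gpytorch

# A minimal sketch of the prior_mode setting.
model.eval()
likelihood.eval()

# Posterior predictions (the usual eval-mode behavior):
with torch.no_grad():
    posterior_pred = likelihood(model(test_x))

# Predictions from the GP prior, ignoring the training data:
with torch.no_grad(), gpytorch.settings.prior_mode(True):
    prior_pred = likelihood(model(test_x))
```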
Bug fixes
- Fixed a type error when calling `backward` on `gpytorch.functions.logdet` (#711).
- Variational models now properly skip posterior variance calculations if the `skip_posterior_variances` context is active (#741).
- Fixed an issue with `diag` mode for `PeriodicKernel` (#761).
- Stability improvements for `inv_softplus` and `inv_sigmoid` (#776).
- Fixed incorrect size handling in `InterpolatedLazyTensor` for rectangular matrices (#906).
- Fixed indexing in `IndexKernel` for batch mode (#911).
- Fixed an issue where slicing batch mode lazy covariance matrices resulted in incorrect behavior (#782).
- Cholesky gives a better error when there are NaNs (#944).
- Use `psd_safe_cholesky` in prediction strategies rather than `torch.cholesky` (#956).
- An error is now raised if Cholesky is used with KeOps, which is not supported (#959).
- Fixed a bug where NaNs could occur during interpolation (#971).
- Fix MLL computation for heteroskedastic noise models (#870).
Last release before 0.4 (and last PyTorch 1.2 compatible release)
A full list of bug fixes and features will be out with the 0.4 release.
Support for PyTorch 1.2
This release addresses breaking changes in the recent PyTorch 1.2 release. Currently, GPyTorch will run on either PyTorch 1.1 or PyTorch 1.2.
A full list of new features and bug fixes will be coming soon in a GPyTorch 0.4 release.
v0.3.4a
Large scale exact GPs, Multibatch support, Performance and stability improvements
New Features
- Implement kernel checkpointing, allowing exact GPs on up to 1M data points with multiple GPUs (#499)
- GPyTorch now supports hard parameter constraints (e.g. bounds) via the `register_constraint` method on `Module` (#596) -- see the sketch after this list.
- All GPyTorch objects now support multiple batch dimensions. In addition to training `b` GPs simultaneously, you can now train a `b1 x b2` matrix of GPs simultaneously if you so choose (#492, #589, #627).
- `RBFKernelGrad` now supports ARD (#602).
- `FixedNoiseGaussianLikelihood` offers a better interface for dealing with known observation noise values. `WhiteNoiseKernel` is now hard deprecated (#593).
- `InvMatmul`, `InvQuadLogDet` and `InvQuad` are now twice differentiable (#603).
- `Likelihood` has been redesigned. See the new documentation for details if you are creating custom likelihoods (#591).
- Better support for more flexible Pyro models. You can now define likelihoods of the form `p(y|f, z)` where `f` is a GP and `z` are arbitrary latent variables learned by Pyro (#591).
- Parameters can now be recursively initialized with full names, e.g. `model.initialize(**{"covar_module.base_kernel.lengthscale": 1., "covar_module.outputscale": 1.})` (#484).
- Added `ModelList` and `LikelihoodList` for training multiple GPs when batch mode can't be used -- see example notebooks (#471).
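
Below is a minimal sketch of the new hard constraints and recursive initialization. It assumes `model` is an exact GP whose `covar_module` is a `ScaleKernel(RBFKernel())` and whose likelihood is a `GaussianLikelihood`; the attribute paths are illustrative and depend on how your model is built.

```python
from gpytorch.constraints import GreaterThan, Interval

# Hard parameter constraints (bounds) via register_constraint:
model.covar_module.base_kernel.register_constraint(
    "raw_lengthscale", Interval(0.05, 2.0)
)
model.likelihood.noise_covar.register_constraint("raw_noise", GreaterThan(1e-4))

# Recursive initialization with fully-qualified parameter names (#484):
model.initialize(**{
    "covar_module.base_kernel.lengthscale": 1.0,
    "covar_module.outputscale": 1.0,
})
```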
Performance and stability improvements
- CG termination is now more tolerance based, and will much more rarely terminate without returning good solves. Furthermore, a warning is raised if it ever does that includes suggested courses of action. (#569)
- In non-ARD mode, RBFKernel and MaternKernel use custom backward implementations for performance (#517)
- Up to a 3x performance improvement in the regime where the test set is very small (#615)
- The noise parameter in `GaussianLikelihood` now has a default lower bound, similar to sklearn (#596).
- `psd_safe_cholesky` now adds successively increasing amounts of jitter rather than adding jitter only once (#610) -- see the sketch after this list.
- Variational inference initialization now uses `psd_safe_cholesky` rather than `torch.cholesky` to initialize with the prior (#610).
- The pivoted Cholesky preconditioner now uses a QR decomposition for its solve rather than the Woodbury formula for speed and stability (#617).
- GPyTorch now uses Cholesky for solves with very small matrices rather than CG, resulting in reduced overhead for that setting (#586)
- Cholesky can additionally be turned on manually for help debugging (#586)
- Kernel distance computations now use `torch.cdist` when on PyTorch 1.1.0 in the non-batch setting (#642).
- CUDA unit tests now default to using the least used available GPU when run (#515).
- `MultiDeviceKernel` is now much faster (#491).
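
As a small illustration of the jittered Cholesky behavior referenced above, the sketch below calls `psd_safe_cholesky` directly on a nearly rank-deficient kernel matrix; the import path reflects the `gpytorch.utils.cholesky` module of this release series.

```python
import torch
from gpytorch.utils.cholesky import psd_safe_cholesky

# Factor a nearly rank-deficient kernel matrix. If plain Cholesky fails,
# psd_safe_cholesky retries with successively larger amounts of diagonal
# jitter instead of a single fixed amount.
x = torch.linspace(0, 1, 200).unsqueeze(-1)
covar = torch.exp(-(x - x.t()).pow(2))   # RBF-style matrix, numerically ill-conditioned

L = psd_safe_cholesky(covar)             # lower-triangular factor (jittered if needed)
print(torch.dist(L @ L.t(), covar))      # reconstruction error
```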
Bug Fixes
- Fixed an issue with variational covariances at test time (#638)
- Fixed an issue where the training covariance wasn't being detached for variance computations, occasionally resulting in backward errors (#566)
- Fixed an issue where `active_dims` in kernels was being applied twice (#576).
- Fixes and stability improvements for `MultiDeviceKernel` (#560).
- Fixed an issue where `fast_pred_var` was failing for single training inputs (#574).
- Fixed an issue when initializing parameter values with non-tensor values (#630).
- Fixed an issue with handling the preconditioner log determinant value for MLL computation (#634).
- Fixed an issue where `prior_dist` was being cached for VI, which was problematic for Pyro models (#599).
- Fixed a number of issues with `LinearKernel`, including one where the variance could go negative (#584).
- Fixed a bug where training inputs couldn't be set with `set_train_data` if they are currently `None` (#565).
- Fixed a number of bugs in `MultitaskMultivariateNormal` (#545, #553).
- Fixed an indexing bug in `batch_symeig` (#547).
- Fixed an issue where `MultitaskMultivariateNormal` wasn't interleaving rows correctly (#540).
Other
- GPyTorch is now fully Python 3.6, and we've begun to include static type hints (#581)
- Parameters in GPyTorch no longer have default singleton batch dimensions. For example, the default shape of `lengthscale` is now `torch.Size([1])` rather than `torch.Size([1, 1])` (#605) -- see the snippet after this list.
- `setup.py` now includes optional dependencies, reads requirements from `requirements.txt`, and does not require `torch` if `pytorch-nightly` is installed (#495).
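
A quick way to see the shape change described above (illustrative; the expected output follows the note on #605 and refers to this release):

```python
import gpytorch

# Per the note above, the default lengthscale no longer carries a singleton
# batch dimension in this release: expect torch.Size([1]) rather than the
# torch.Size([1, 1]) of earlier versions.
kernel = gpytorch.kernels.RBFKernel()
print(kernel.lengthscale.shape)
```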
Many new features (variational inference, multi GPU, derivative observations), plus JIT
0.2.1
You can install GPyTorch via Anaconda (#463)
Speed and stability
- Kernel distances use the JIT for fast computations (#464)
- LinearCG uses the JIT for fast computations (#464)
- Improve the stability of computing kernel distances (#455)
Features
Variational inference improvements
- Sped up variational models by batching all matrix solves in one call (#454)
- Can use the same set of inducing points for batch variational GPs (#445)
- Whitened variational inference for improved convergence (#493)
- Variational log likelihoods for BernoulliLikelihood are computed with quadrature (#473)
Multi-GPU Gaussian processes
- Can train and test GPs by dividing the kernel onto multiple GPUs (#450)
GPs with derivatives
- Can define RBFKernels for observations and their derivatives (#462)
LazyTensors
- LazyTensors can broadcast matrix multiplication (#459)
- Can use the `@` operator for matrix multiplication with LazyTensors (see the sketch below)
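
A minimal sketch of broadcasting matmul and the `@` operator on LazyTensors; `NonLazyTensor` simply wraps a dense tensor and is used here purely for illustration.

```python
import torch
from gpytorch.lazy import NonLazyTensor

lazy_mat = NonLazyTensor(torch.randn(5, 3, 3))   # a batch of five 3x3 matrices
rhs = torch.randn(3, 2)                          # broadcast against the batch dimension

result = lazy_mat @ rhs                          # equivalent to lazy_mat.matmul(rhs)
print(result.shape)                              # torch.Size([5, 3, 2])
```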
GP-list
- Convenience methods for training/testing multiple GPs in a list (#471)
Other
- Added a `gpytorch.settings.fast_computations` feature to (optionally) use Cholesky-based inference (#456) -- see the sketch after this list.
- Distributions define event shapes (#469)
- Can recursively initialize parameters on GP modules (#484)
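
A minimal sketch of turning the `fast_computations` flags off to fall back to Cholesky-based (exact) inference, e.g. for debugging; `model`, `mll`, `train_x`, and `train_y` are assumed to come from a standard exact-GP training loop.

```python
import gpytorch

# Disable the fast (iterative) paths so that inference uses Cholesky instead.
with gpytorch.settings.fast_computations(
    covar_root_decomposition=False, log_prob=False, solves=False
):
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
```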
Bugs
Batch GPs, fantasy observations in models, efficient cache updates, various bug and stability fixes (v0.1.1)
v0.1.1
Features
- Batch GPs, which previously were a feature, are now well-documented and much more stable (see docs)
- Can add "fantasy observations" to models.
- Option for exact marginal log likelihood and sampling computations (this is slower, but potentially useful for debugging) (
gpytorch.settings.fast_computations
)
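
A minimal sketch of adding fantasy observations via `get_fantasy_model`; `model`, `train_x`, and `test_x` are assumed to come from a trained exact GP with inputs of shape `(n, 1)`, and the new points below are placeholders.

```python
import torch

# get_fantasy_model returns a new model conditioned on the extra data via an
# efficient cache update, without re-fitting hyperparameters.
model.eval()

new_x = torch.rand(5, 1)
new_y = torch.randn(5)

fantasy_model = model.get_fantasy_model(new_x, new_y)
fantasy_pred = fantasy_model(test_x)   # predictions now condition on (new_x, new_y)
```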
Bug fixes
- Easier usage of batch GPs
- Reduce bugs in additive regression models
Beta Release (v0.1.0)
0.1 release
Improve stability of hyperparameters, more stable variational inference
Stability of hyperparameters
- Hyperparameters that are constrained to be positive (e.g. variance, lengthscale, etc.) are now parameterized through the softplus function (`log(1 + e^x)`) rather than through the log function -- see the sketch after this list.
- This dramatically improves the numerical stability and optimization of hyperparameters.
- Old models that were trained with `log` parameters will still work, but this is deprecated.
- Inference now handles certain numerical floating point round-off errors more gracefully.
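
For concreteness, a minimal sketch of the softplus parameterization and its inverse (plain PyTorch, not GPyTorch-specific):

```python
import torch

# A positive hyperparameter theta is stored as an unconstrained raw value and
# recovered via theta = softplus(raw) = log(1 + exp(raw)), instead of exp(raw).
raw_value = torch.tensor(-2.0, requires_grad=True)   # unconstrained
theta = torch.nn.functional.softplus(raw_value)      # always positive

# Inverse transform, used when initializing from a desired positive value:
target = torch.tensor(0.5)
raw_init = torch.log(torch.expm1(target))            # softplus^{-1}(0.5)
print(torch.nn.functional.softplus(raw_init))        # ~0.5
```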
Various stability improvements to variational inference
Other changes
- `GridKernel` can be used for data that lies on a perfect grid.
- New preconditioner for LazyTensors.
- Use batched Cholesky functions for improved performance (requires updating PyTorch).
Major bug fixes and stability improvements for VI, default derivative for LazyTensor
Pre-release
New features
- Implement diagonal correction for basic variational inference, improving predictive variance estimates. This is on by default.
- `LazyTensor._quad_form_derivative` now has a default implementation! While custom implementations are likely to still be faster in many cases, this means that it is no longer required to implement a custom `_quad_form_derivative` when implementing a new `LazyTensor` subclass -- see the sketch after this list.
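
A minimal sketch of a `LazyTensor` subclass that relies on the default `_quad_form_derivative`; the class shown simply wraps a dense tensor (which `NonLazyTensor` already does) and is meant only to show the minimal set of methods to override.

```python
import torch
from gpytorch.lazy import LazyTensor

class DenseWrapperLazyTensor(LazyTensor):
    # Only _matmul, _size, and _transpose_nonbatch are implemented; no custom
    # _quad_form_derivative is needed thanks to the new default.
    def __init__(self, tensor):
        super().__init__(tensor)
        self.tensor = tensor

    def _matmul(self, rhs):
        return self.tensor.matmul(rhs)

    def _size(self):
        return self.tensor.shape

    def _transpose_nonbatch(self):
        return DenseWrapperLazyTensor(self.tensor.transpose(-1, -2))

lazy = DenseWrapperLazyTensor(torch.randn(4, 4))
print(lazy.matmul(torch.randn(4, 2)).shape)   # torch.Size([4, 2])
```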
Bug fixes
- Fix a number of critical bugs for the new variational inference.
- Do some hyperparameter tuning for the SV-DKL example notebook, and include fancier NN features like batch normalization.
- Made it more likely that operations internally preserve the ability to perform preconditioning for linear solves and log determinants. This may have a positive impact on model performance in some cases.