Jasper Snoek, Hugo Larochelle, Ryan P. Adams. Practical Bayesian Optimization of Machine Learning Algorithms. NIPS 2012.
- The authors propose a fully Bayesian treatment of Gaussian process hyperparameters, marginalizing over them rather than optimizing a single setting.
- Empirical results on logistic regression, online LDA, structured SVMs, and convolutional neural networks.
- Also addresses the unknown cost of function evaluations and the availability of multiple cores for running evaluations in parallel.
Considerations for Bayesian optimization of hyperparameters include: 1) choice of covariance function and treatment of its hyperparameters, 2) function evaluations consume time and resources, 3) the optimization should take advantage of multi-core parallelism. A minimal loop illustrating the basic setup is sketched below.
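To make these considerations concrete, here is a minimal Bayesian-optimization loop (my own sketch, not the authors' code): a GP surrogate with a fixed squared-exponential kernel and an expected-improvement acquisition maximized over a random candidate set; the toy `objective` stands in for an expensive training run.

```python
import numpy as np
from scipy.stats import norm

def sq_exp_kernel(A, B, length_scale=0.3, amplitude=1.0):
    # Squared-exponential kernel with a single (fixed) length scale
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return amplitude * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # GP predictive mean and standard deviation at candidate points Xs
    K = sq_exp_kernel(X, X) + noise * np.eye(len(X))
    Ks = sq_exp_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(sq_exp_kernel(Xs, Xs)) - (v ** 2).sum(0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI for minimization: expected reduction below the current best value
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def objective(x):
    # Toy 1-D objective standing in for a costly training run
    return np.sin(3 * x[0]) + 0.5 * x[0] ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))            # small initial design
y = np.array([objective(x) for x in X])
for _ in range(20):
    cand = rng.uniform(-2, 2, size=(500, 1))   # random candidate set
    mu, sigma = gp_posterior(X, y, cand)
    ei = expected_improvement(mu, sigma, y.min())
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))
print("best value found:", y.min())
```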
- Using an ARD Matérn 5/2 kernel instead of the automatic relevance determination (ARD) squared exponential kernel avoids the unrealistically smooth sample functions the latter implies. Furthermore, we can marginalize over the kernel hyperparameters to compute an integrated acquisition function (see the sketch below).
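A sketch of the two modeling choices in this bullet (illustrative assumptions, not the paper's implementation): an ARD Matérn 5/2 kernel, and an integrated EI obtained by averaging EI over sampled length scales. The paper uses slice sampling for the hyperparameter posterior; here samples are drawn from a simple log-normal prior just to keep the sketch short.

```python
import numpy as np
from scipy.stats import norm

def matern52_ard(A, B, length_scales, amplitude=1.0):
    # Scaled squared distance with one length scale per dimension (the "ARD" part)
    d2 = (((A[:, None, :] - B[None, :, :]) / length_scales) ** 2).sum(-1)
    r = np.sqrt(5.0 * d2)
    return amplitude * (1.0 + r + 5.0 * d2 / 3.0) * np.exp(-r)

def ei_under_kernel(X, y, cand, length_scales, noise=1e-6):
    # EI at candidate points for one fixed setting of the kernel hyperparameters
    K = matern52_ard(X, X, length_scales) + noise * np.eye(len(X))
    Ks = matern52_ard(X, cand, length_scales)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.maximum(np.diag(matern52_ard(cand, cand, length_scales)) - (v ** 2).sum(0), 1e-12)
    sigma = np.sqrt(var)
    z = (y.min() - mu) / sigma
    return (y.min() - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def integrated_ei(X, y, cand, n_samples=10, rng=None):
    # Monte Carlo average of EI over hyperparameter samples; the paper draws these
    # from the posterior via slice sampling, here a log-normal prior is used instead
    rng = rng or np.random.default_rng(0)
    samples = rng.lognormal(mean=0.0, sigma=0.5, size=(n_samples, X.shape[1]))
    return np.mean([ei_under_kernel(X, y, cand, ls) for ls in samples], axis=0)
```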
- We should optimize expected improvement per second rather than per iteration, since evaluations vary widely in wall-clock cost. A second Gaussian process can model the (log) duration of an evaluation (sketched below).
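A minimal illustration of expected improvement per second, assuming a GP regression routine like the `gp_posterior` sketched above is available: fit a second GP to the log of observed wall-clock durations and divide EI by the predicted duration.

```python
import numpy as np

def ei_per_second(ei_values, cand, X, log_durations, gp_posterior):
    # gp_posterior(X, y, cand) -> (mean, std); any GP regression routine works,
    # e.g. the one sketched earlier in these notes.
    mu_log_cost, _ = gp_posterior(X, log_durations, cand)
    predicted_seconds = np.exp(mu_log_cost)      # predicted duration at each candidate
    return ei_values / predicted_seconds         # rank candidates by improvement per second
```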
- Parallelism is handled with a sequential strategy that uses Monte Carlo estimates of the acquisition function, integrating over the possible outcomes of evaluations that are still pending (sketched below).
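A sketch of that Monte Carlo treatment of pending evaluations (my simplification: fantasy outcomes are drawn independently from the marginal predictive rather than jointly): draw possible outcomes for the pending points, condition the GP on each fantasy, compute EI, and average.

```python
import numpy as np

def mc_acquisition(X, y, pending, cand, gp_posterior, ei, n_fantasies=10, rng=None):
    # X, y: completed evaluations; pending: points whose jobs are still running;
    # gp_posterior and ei as sketched earlier in these notes.
    rng = rng or np.random.default_rng(0)
    mu_p, sigma_p = gp_posterior(X, y, pending)
    acq = np.zeros(len(cand))
    for _ in range(n_fantasies):
        y_fantasy = rng.normal(mu_p, sigma_p)          # sampled outcomes for pending jobs
        X_aug = np.vstack([X, pending])
        y_aug = np.concatenate([y, y_fantasy])
        mu, sigma = gp_posterior(X_aug, y_aug, cand)   # GP conditioned on the fantasy
        acq += ei(mu, sigma, y_aug.min())
    return acq / n_fantasies                           # Monte Carlo estimate of the acquisition
```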
- Experiments include the Branin-Hoo function (a common optimization benchmark, given below) and logistic regression hyperparameters on MNIST, compared against the Tree Parzen Algorithm.
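For reference, the Branin-Hoo function in its common parameterization (standard benchmark definition, not code from the paper):

```python
import numpy as np

def branin_hoo(x1, x2):
    # Standard constants; domain usually x1 in [-5, 10], x2 in [0, 15]
    a, b, c = 1.0, 5.1 / (4 * np.pi ** 2), 5.0 / np.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * np.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1 - t) * np.cos(x1) + s

print(branin_hoo(np.pi, 2.275))   # one of the global minimizers, value ~0.398
```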
- Online LDA: two learning parameters control the learning rate used to update the variational parameters of LDA.
- Motif finding with structured SVMs, which outperform standard SVMs when they can explicitly model problem-dependent hidden variables; here the latent variables model the unknown location of particular subsequences (motifs) in protein DNA sequences.
- CNNs on CIFAR-10: tuning nine hyperparameters of a three-layer convolutional network.
TODO