Despite the advantages of gradient variance reduction in near-convex problems, a natural gap between theory and practice is whether we should also remove the gradient noise in non-convex problems. To fill this gap, we focus only on variance reduction for the noisy energy estimators, which is what delivers the theoretical acceleration, and no longer reduce the variance of the noisy gradients, so that the empirical experience from momentum stochastic gradient descent (M-SGD) carries over naturally.
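For intuition, here is a minimal sketch of the control-variate idea behind the variance-reduced energy estimator. The per-example `energy_fn`, the snapshot handling, and the argument names are hypothetical and only illustrate the SVRG-style estimator; this is not the repository's implementation.

```python
def vr_energy(theta, theta_hat, full_energy_at_snapshot, batch_idx,
              energy_fn, N):
    """Variance-reduced estimate of the full-data energy U(theta).

    energy_fn(theta, i) is a hypothetical per-example energy. theta_hat is
    a snapshot of the parameters refreshed every few epochs (cf. -period),
    at which point sum_i energy_fn(theta_hat, i) is computed once over the
    full data set and cached as full_energy_at_snapshot.
    """
    n = len(batch_idx)
    # Control variate: the minibatch noise in the two sums largely cancels
    # because theta stays close to the snapshot theta_hat between refreshes.
    diff = sum(energy_fn(theta, i) - energy_fn(theta_hat, i)
               for i in batch_idx)
    return (N / n) * diff + full_energy_at_snapshot
```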
- Python 2.7
- PyTorch >= 1.1
- numpy
- CUDA
Please cite our paper (link) if you find it useful for uncertainty estimation:
@inproceedings{VR-reSGLD,
  title={Accelerating Convergence of Replica Exchange Stochastic Gradient MCMC via Variance Reduction},
  author={Wei Deng and Qi Feng and Georgios P. Karagiannis and Guang Lin and Faming Liang},
  booktitle={International Conference on Learning Representations},
  year={2021}
}
Momentum stochastic gradient descent (M-SGD) with 500 epochs, a batch size of 256, and a decaying learning rate
$ python bayes_cnn.py -sn 500 -chains 1 -lr 2e-6 -LRanneal 0.984 -T 1e-300 -burn 0.6
Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) with an annealed temperature during the warm-up period and a fixed temperature afterward
$ python bayes_cnn.py -sn 500 -chains 1 -lr 2e-6 -LRanneal 0.984 -T 0.01 -Tanneal 1.02 -burn 0.6
Standard SGHMC with cyclic learning rates (cSGHMC) and 1000 epochs
$ python bayes_cnn.py -sn 1000 -chains 1 -lr 2e-6 -LRanneal 1.0 -T 0.001 -cycle 5 -period 0 -burn 0.7
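The cyclic schedule follows the cosine form of cyclical SG-MCMC (Zhang et al.); below is a minimal sketch, assuming `-cycle 5` splits the 1000 epochs into 5 equal cosine cycles. The function name and defaults are illustrative, not the repository's API.

```python
import math

def cyclic_lr(epoch, total_epochs=1000, n_cycles=5, lr0=2e-6):
    # Cosine cyclical schedule: the rate restarts at lr0 at the beginning
    # of each cycle and decays smoothly toward zero, so sampling
    # concentrates near the end of every cycle.
    period = total_epochs // n_cycles
    t = (epoch % period) / period  # position within the cycle, in [0, 1)
    return 0.5 * lr0 * (math.cos(math.pi * t) + 1)
```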
Standard Replica Exchange SGHMC (reSGHMC) with an annealed temperature during the warm-up period and a fixed temperature afterward
$ python bayes_cnn.py -sn 500 -chains 2 -lr 2e-6 -LRanneal 0.984 -T 0.01 -var_reduce 0 -period 2 -bias_F 1.5e5 -burn 0.6
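For reference, the swap step behind `-bias_F` can be sketched as follows. This is an illustrative reading of the corrected swap test from the ICML'20 paper, where the correction offsets the bias introduced by noisy energy estimators; it is not the repository's exact implementation.

```python
import math
import random

def attempt_swap(E_low, E_high, T_low, T_high, bias_F):
    # Metropolis-style swap between the low- and high-temperature chains.
    # Subtracting a correction (here the hard-coded bias_F) from the energy
    # difference compensates for the bias caused by the stochastic energy
    # estimators, as analyzed in the reSGHMC paper (ICML'20).
    dT = 1.0 / T_low - 1.0 / T_high
    log_ratio = dT * (E_low - E_high - bias_F)
    return math.log(random.random()) < log_ratio
```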
Variance-Reduced Replica Exchange SGHMC (VR-reSGHMC) with control variates updated every 2 epochs and a fixed temperature after the warm-up period (Algorithm 1)
$ python bayes_cnn.py -sn 500 -chains 2 -lr 2e-6 -LRanneal 0.984 -T 0.01 -var_reduce 1 -period 2 -bias_F 1.5e5 -burn 0.6 -seed 85674
Variance-Reduced Replica Exchange SGHMC (VR-reSGHMC) with adaptive control variates and a fixed temperature after the warm-up period (Algorithm 2)
$ python bayes_cnn.py -sn 500 -chains 2 -lr 2e-6 -LRanneal 0.984 -T 0.01 -var_reduce 1 -period 2 -bias_F 1.5e5 -burn 0.6 -adapt_c 1
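With `-adapt_c 1`, Algorithm 2 reweights the control variate instead of using a fixed unit coefficient. Below is a minimal sketch of the classical optimal weight c* = Cov(x, y) / Var(y), estimated from recent minibatch energy pairs; the function and its inputs are assumptions for illustration, and the paper's exact update may differ (e.g., via moving averages).

```python
import numpy as np

def adaptive_coefficient(batch_energies, snapshot_energies):
    # Estimate c* = Cov(U(theta), U(theta_hat)) / Var(U(theta_hat)) from
    # paired minibatch energies; c* minimizes the variance of the
    # control-variate estimator U(theta) - c * (U(theta_hat) - E[U(theta_hat)]).
    x = np.asarray(batch_energies, dtype=float)
    y = np.asarray(snapshot_energies, dtype=float)
    cov = np.cov(x, y)  # 2x2 sample covariance matrix
    return cov[0, 1] / max(cov[1, 1], 1e-12)  # guard against zero variance
```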
Variance-Reduced Replica Exchange SGHMC (VR-reSGHMC) with adaptive control variates and a constant temperature (Algorithm 2)
$ python bayes_cnn.py -sn 500 -chains 2 -lr 2e-6 -LRanneal 0.984 -T 0.0001 -Tanneal 1 -var_reduce 1 -period 2 -bias_F 1.5e7 -burn 0.6 -adapt_c 1
Apply temperature scaling with a factor of 2 for uncertainty calibration
$ python uncertainty_test.py -c VR_reSGHMC -T_scale 2
$ python uncertainty_test.py -c cSGHMC -T_scale 2
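Temperature scaling itself is a one-liner: the logits are divided by `T_scale` before the softmax, which softens over-confident predictions (Guo et al., 2017). A minimal PyTorch sketch with illustrative names:

```python
import torch.nn.functional as F

def calibrated_probs(logits, T_scale=2.0):
    # Dividing the logits by T_scale > 1 flattens the softmax, so the
    # predicted class probabilities are less over-confident.
    return F.softmax(logits / T_scale, dim=-1)
```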
- M. Welling, Y. Teh. Bayesian Learning via Stochastic Gradient Langevin Dynamics. ICML'11.
- W. Deng, Q. Feng, L. Gao, F. Liang, G. Lin. Non-convex Learning via Replica Exchange Stochastic Gradient MCMC. ICML'20.
- W. Deng, Q. Feng, G. Karagiannis, G. Lin, F. Liang. Accelerating Convergence of Replica Exchange Stochastic Gradient MCMC via Variance Reduction. ICLR'21.