
Bayesian Optimisation acquisition functions PI and EI modified under a Gaussian noise assumption at observations


HuabingWang-stack/PI_EI_Under_Gaussian_Noise_Assumption


PI and EI under Gaussian noise assumption

This repository contains Python code for Bayesian optimization with PI, EI, and modifications of PI (MPI) and EI (MEI) under a Gaussian noise assumption in the loss function. The math is detailed in Modifications of PI and EI under Gaussian Noise Assumption in Current Optima. This repo has three files:

  • bo_acquis.py: the Bayesian Optimisation code, with PI and EI adapted from bayesian-optimization, plus new code for MPI and MEI.
  • plotters.py: plotting functions, adapted from bayesian-optimization, for drawing the estimated loss surface and acquisition values at each iteration.
  • PI_EI_MPI_MEI_Benchmark.ipynb: a tutorial that runs the Bayesian algorithm with the four acquisition functions to find the global optima of noise-corrupted benchmark functions.

The signature of the optimization function is unchanged:

bayesian_optimisation(n_iters, sample_loss, bounds, x0=None, n_pre_samples=5,
                      gp_params=None, random_search=False, alpha=1e-5, epsilon=1e-7)
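As a hedged sketch of how this signature might be used, the snippet below defines an illustrative noisy loss; `noisy_sphere` and the parameter values are examples, not part of the repo, and importing `bayesian_optimisation` from `bo_acquis.py` is assumed, so the actual call is left as a comment:

```python
import numpy as np

def noisy_sphere(x):
    # sphere loss corrupted by Gaussian noise, mirroring the benchmarks below
    return float(np.sum(x ** 2) + np.random.normal(0.0, 10.0))

# two-dimensional search box; each row is (lower, upper) for one parameter
bounds = np.array([[-5.0, 5.0], [-5.0, 5.0]])

# assumed import and call, following the signature above:
# from bo_acquis import bayesian_optimisation
# xp, yp = bayesian_optimisation(n_iters=45, sample_loss=noisy_sphere,
#                                bounds=bounds, random_search=10000)
```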

Background

Probability of improvement (PI) and expected improvement (EI) are calculated with respect to the current optimum $\tilde{y}$. In some cases, evaluations of the loss function carry Gaussian noise, $y_i \sim \mathcal{N} (f(\mathbf{x}_i),\sigma^2_y)$. Here we modify PI and EI under the assumption that all observations, including the current optimum, are noisy. They instead calculate probability of improvement and expected improvement with respect to the posterior mean $\mu(\tilde{\mathbf{x}})$ and variance $\kappa(\tilde{\mathbf{x}},\tilde{\mathbf{x}})$ at the loss optimum, where $\tilde{\mathbf{x}}$ is the parameter setting at the current optimum. To learn the Gaussian noise in the observations, we add a white kernel to the originally adopted Matern GP kernel, which enables uncertainty quantification at evaluated locations.
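The Matern-plus-white-kernel setup can be sketched with scikit-learn (an assumption about the underlying GP library; the data and variable names are illustrative). Because the white kernel absorbs the observation noise, the posterior standard deviation stays strictly positive even at evaluated points:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(0.0, 0.3, size=30)  # noisy observations

# Matern kernel plus a white kernel whose noise level is fitted from data
kernel = Matern(nu=2.5) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-5, n_restarts_optimizer=3)
gp.fit(X, y)

# posterior mean and std at the training locations; std remains positive,
# reflecting residual uncertainty about the true loss at evaluated points
mu, std = gp.predict(X, return_std=True)
```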

Let $\rho$ denote $\sqrt{\kappa (\mathbf{x}, \mathbf{x})+ \kappa (\tilde{\mathbf{x}}, \tilde{\mathbf{x}})-2 \kappa (\mathbf{x}, \tilde{\mathbf{x}})}$. The mathematical expressions of modified PI and EI under the Gaussian noise assumption are:

$$ \text{Modified PI: } a_{MPI}(\mathbf{x}) = \Phi \left(\frac{\mu(\tilde{\mathbf{x}}) - \mu ( \mathbf{x} ) }{\rho}\right) $$

$$ \text{Modified EI: } a_{MEI}(\mathbf{x}) = \Phi\left(\frac{\mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x})}{\rho}\right)\left(\mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x})\right)+ \phi\left(\frac{\mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x})}{\rho}\right)\rho $$
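The two formulas above, together with $\rho$, can be written as a minimal sketch; `mpi`, `mei`, and `rho` are illustrative names, not the repo's API, and improvement is taken in the minimisation sense ($\mu(\mathbf{x})$ below $\mu(\tilde{\mathbf{x}})$):

```python
import math

def _Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def _phi(z):
    # standard normal PDF
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def rho(k_xx, k_tt, k_xt):
    # rho = sqrt(k(x,x) + k(x~,x~) - 2 k(x,x~)); clamp for numerical safety
    return math.sqrt(max(k_xx + k_tt - 2.0 * k_xt, 0.0))

def mpi(mu_x, mu_best, r):
    # Modified PI: Phi((mu(x~) - mu(x)) / rho)
    return _Phi((mu_best - mu_x) / r)

def mei(mu_x, mu_best, r):
    # Modified EI: (mu(x~) - mu(x)) * Phi(z) + rho * phi(z), z = diff / rho
    diff = mu_best - mu_x
    z = diff / r
    return diff * _Phi(z) + r * _phi(z)
```

When the candidate's posterior mean equals the incumbent's, `mpi` reduces to 0.5 and `mei` to $\rho\,\phi(0)$, matching the intuition that the acquisition is driven purely by the combined posterior uncertainty $\rho$ in that case.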

Current Experiment Results

We test Bayesian Optimisation with the four acquisition functions on finding the global minima of benchmark functions. PI and EI under a GP model with the original Matern kernel and with the Matern+white kernel are both tested as control groups.

Together with the white kernel, MPI shows a better and more stable performance than PI on most of the benchmark functions against the pre-set Gaussian noise $\mathcal{N}(\mu=0,\sigma = 10)$, and we expect its advantage to grow as the noise becomes larger.

Below is the lowest loss we achieved on each benchmark function with an added Gaussian noise $\mathcal{N}(\mu=0,\sigma = 10)$. The Bayesian Optimisation parameter settings are iter = 45 and random_search = 10000. Results are averaged over 30 repeated trials, reported as mean±std. All results are at the Cloud Drive.

| acquisition function | six-hump | rastrigin | goldstein | rotated-hyper-ellipsoid | sphere |
|---|---|---|---|---|---|
| MPI, kernel=matern+white | -21.58±5.30 | -10.20±6.10 | 10.77±5.85 | -21.54±5.40 | -15.28±6.77 |
| MEI, kernel=matern+white | -20.34±4.04 | -10.98±4.64 | 24.20±8.47 | -18.11±4.97 | -15.65±5.29 |
| PI, kernel=matern+white | -14.96±5.34 | -3.34±9.15 | 28.83±18.46 | 14.70±76.71 | -12.20±5.68 |
| EI, kernel=matern+white | -16.56±5.82 | -4.60±8.27 | 23.60±7.19 | -18.84±5.61 | -14.68±5.36 |
| PI, kernel=matern | -21.75±5.28 | -6.39±6.85 | 13.16±6.09 | -16.33±4.51 | -15.14±5.83 |
| EI, kernel=matern | -20.57±4.74 | -8.29±7.45 | 22.9±47.27 | -18.43±5.15 | -13.52±6.05 |

We perform Bayesian Optimisation on the rastrigin function with PI (kernel=matern) and MPI (kernel=matern+white); the probability of improvement and the loss surface are plotted at each iteration. Here MPI behaves more like PI on a noise-free loss surface, focusing its exploitation on one point, whereas PI is disturbed by the noise and loses its focus.

Figures: Rastrigin surface; PI searching trajectory; MPI searching trajectory.
