PI and EI under Gaussian Noise Assumption

This repository contains Python code for Bayesian optimization with PI, EI, and modifications of PI (MPI) and EI (MEI) under a Gaussian noise assumption on the loss function. The math is detailed in Modifications of PI and EI under Gaussian Noise Assumption in Current Optima. This repo has three files:

  • bo_acquis.py: the Bayesian optimisation loop, with PI and EI adapted from bayesian-optimization and new code for MPI and MEI.
  • plotters.py: plotting functions, adapted from bayesian-optimization, for the estimated loss surface and the acquisition values at each iteration.
  • PI_EI_MPI_MEI_Benchmark.ipynb: a tutorial that runs the Bayesian optimisation algorithm with the four acquisition functions to find the global optima of noise-corrupted benchmark functions.

The signature of the optimization function is still:

bayesian_optimisation(n_iters, sample_loss, bounds, x0=None, n_pre_samples=5,
                      gp_params=None, random_search=False, alpha=1e-5, epsilon=1e-7)
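
A minimal usage sketch (not taken from the repo) is below: the return convention of (sampled points, sampled losses) and the reading of random_search as a per-iteration candidate count are assumptions here, following the upstream bayesian-optimization code.

# Hypothetical usage sketch: minimise a noise-corrupted sphere function.
import numpy as np
from bo_acquis import bayesian_optimisation  # assumed import path

def noisy_sphere(x):
    # Each evaluation carries i.i.d. Gaussian noise: y ~ N(f(x), sigma_y^2)
    return np.sum(x ** 2) + np.random.normal(0.0, 10.0)

bounds = np.array([[-5.0, 5.0], [-5.0, 5.0]])
xp, yp = bayesian_optimisation(n_iters=45, sample_loss=noisy_sphere,
                               bounds=bounds, random_search=10000)
print(yp.min())  # lowest (noisy) loss observed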

Background

Probability of improvement (PI) and expected improvement (EI) are calculated with respect to the current optimum $\tilde{y}$. In some cases, evaluations of the loss function carry Gaussian noise, $y_i \sim \mathcal{N}(f(\mathbf{x}_i), \sigma^2_y)$. Here we modify PI and EI under the assumption that all observations, including the current optimum, are noisy. The modified acquisitions compute the probability of improvement and the expected improvement with respect to the posterior mean $\mu(\tilde{\mathbf{x}})$ and variance $\kappa(\tilde{\mathbf{x}},\tilde{\mathbf{x}})$ at the loss optimum instead, where $\tilde{\mathbf{x}}$ is the parameter setting at the current optimum. To learn the Gaussian noise in the observations, we add a white kernel to the originally adopted Matern GP kernel; this enables uncertainty quantification at already-evaluated locations.
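
A minimal sketch of this kernel construction with scikit-learn; the hyperparameter values are illustrative placeholders, not the repo's settings:

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

# Matern models the latent loss surface; WhiteKernel learns the variance
# of the i.i.d. Gaussian observation noise from the data.
kernel = Matern(length_scale=1.0, nu=2.5) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10,
                              normalize_y=True)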

Let $\rho$ denote $\sqrt{\kappa (\mathbf{x}, \mathbf{x})+ \kappa (\tilde{\mathbf{x}}, \tilde{\mathbf{x}})-2 \kappa (\mathbf{x}, \tilde{\mathbf{x}})}$. The modified PI and EI under the Gaussian noise assumption are:

$$ \text{Modified PI: } a_{MPI}(\mathbf{x}) = \Phi \left( \frac{\mu(\tilde{\mathbf{x}}) - \mu ( \mathbf{x} ) }{\rho} \right) $$

$$ \text{Modified EI: } a_{MEI}(\mathbf{x}) = \left( \mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x}) \right) \Phi\left( \frac{\mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x})}{\rho} \right) + \rho\, \phi\left( \frac{\mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x})}{\rho} \right) $$
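
A minimal sketch of these two acquisitions for minimisation follows; this is an illustration, not the repo's bo_acquis.py, and obtaining $\kappa(\mathbf{x}, \tilde{\mathbf{x}})$ via scikit-learn's joint posterior covariance is an assumption:

import numpy as np
from scipy.stats import norm

def modified_pi_ei(gp, x, x_best):
    # Joint posterior over the candidate x and the incumbent x~ gives the
    # 2x2 covariance needed for rho^2 = k(x,x) + k(x~,x~) - 2 k(x,x~).
    X = np.vstack([np.atleast_2d(x), np.atleast_2d(x_best)])
    mu, cov = gp.predict(X, return_cov=True)
    rho = np.sqrt(max(cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1], 1e-12))
    z = (mu[1] - mu[0]) / rho  # improvement mu(x~) - mu(x), scaled by rho
    a_mpi = norm.cdf(z)
    a_mei = (mu[1] - mu[0]) * norm.cdf(z) + rho * norm.pdf(z)
    return a_mpi, a_mei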

Current Experiment Results

We test Bayesian optimisation with the four acquisition functions on the task of finding the global minima of benchmark functions. As control groups, PI and EI are tested under GP models with both the original Matern kernel and the Matern+white kernel.

Together with the white kernel, MPI shows better results and more stable performance than PI on most of the benchmark functions under the pre-set Gaussian noise $\mathcal{N}(\mu=0,\sigma = 10)$, and we expect its advantage to grow as the noise level increases.

Below is the lowest loss achieved on each benchmark function with added Gaussian noise $\mathcal{N}(\mu=0,\sigma = 10)$. The Bayesian optimisation settings are iter = 45 and random_search = 10000. Each result is averaged over 30 repeated trials and reported as mean±std. All results are available at the Cloud Drive.

| acquisition function | six-hump | rastrigin | goldstein | rotated-hyper-ellipsoid | sphere |
| --- | --- | --- | --- | --- | --- |
| MPI, kernel=matern+white | -21.58±5.30 | -10.20±6.10 | 10.77±5.85 | -21.54±5.40 | -15.28±6.77 |
| MEI, kernel=matern+white | -20.34±4.04 | -10.98±4.64 | 24.20±8.47 | -18.11±4.97 | -15.65±5.29 |
| PI, kernel=matern+white | -14.96±5.34 | -3.34±9.15 | 28.83±18.46 | 14.70±76.71 | -12.20±5.68 |
| EI, kernel=matern+white | -16.56±5.82 | -4.60±8.27 | 23.60±7.19 | -18.84±5.61 | -14.68±5.36 |
| PI, kernel=matern | -21.75±5.28 | -6.39±6.85 | 13.16±6.09 | -16.33±4.51 | -15.14±5.83 |
| EI, kernel=matern | -20.57±4.74 | -8.29±7.45 | 22.9±47.27 | -18.43±5.15 | -13.52±6.05 |
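
The mean±std entries above come from repeating each run; a sketch of that averaging loop is below, reusing the hypothetical noisy_sphere benchmark from the usage sketch:

import numpy as np
from bo_acquis import bayesian_optimisation  # assumed import path

def noisy_sphere(x):
    return np.sum(x ** 2) + np.random.normal(0.0, 10.0)

bounds = np.array([[-5.0, 5.0], [-5.0, 5.0]])
best = []
for trial in range(30):
    xp, yp = bayesian_optimisation(n_iters=45, sample_loss=noisy_sphere,
                                   bounds=bounds, random_search=10000)
    best.append(yp.min())  # lowest observed (noisy) loss in this run
print("%.2f +/- %.2f" % (np.mean(best), np.std(best)))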

We perform Bayesian optimisation on the rastrigin function with PI (kernel=matern) and MPI (kernel=matern+white), plotting the probability of improvement and the estimated loss surface at each iteration. Here MPI behaves more like PI does on a noise-free loss surface, focusing its exploitation on a single point, whereas PI is disturbed by the noise and loses its focus.

Figures: rastrigin surface; PI searching trajectory; MPI searching trajectory.
