
fab-torch's People

Contributors

lollcat · thargreaves · vincentstimper


fab-torch's Issues

Cleaning

Add linting, typing and documentation for all functions

Add defensive importance sampling

  • Use a mixture distribution combining the flow with a "defensive distribution" (a sketch follows this list).
  • The defensive distribution should help in regions where the flow misses important parts of the target (initially during training), or where the tail of the flow is overly light.
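
A minimal sketch of what the mixture density could look like (the function names and the broad Gaussian are illustrative assumptions, not the repo's API):

```python
import math
import torch

def defensive_mixture_log_prob(x, flow_log_prob, defensive_log_prob, mix_weight=0.1):
    """Log density of (1 - mix_weight) * q_flow + mix_weight * q_defensive.

    flow_log_prob / defensive_log_prob are callables returning log densities
    of shape (batch,). Since q_mix >= mix_weight * q_defensive, the broad
    defensive component bounds the importance weights p(x) / q_mix(x) in
    regions where the flow's density is too light.
    """
    stacked = torch.stack([
        math.log(1.0 - mix_weight) + flow_log_prob(x),
        math.log(mix_weight) + defensive_log_prob(x),
    ])
    return torch.logsumexp(stacked, dim=0)

# e.g. a broad Gaussian as the defensive component:
defensive = torch.distributions.MultivariateNormal(torch.zeros(2), 10.0 * torch.eye(2))
# ... then pass defensive_log_prob=defensive.log_prob
```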

GMM improvements

The following improvements can be made to the GMM problem

  • Set the Metropolis step size to a constant (initialised to 1). Currently it is tuned towards a low target_p_accept, which makes the step size quite large, so many of the points from AIS are not useful for updating the flow. We could set target_p_accept to something more reasonable (e.g. 0.65), but fixing the step size at its initial value simplifies things nicely, and we should still get good performance. It also improves the comparison to SNFs: our current SNF implementation has no step-size tuning, so a fixed step size gives identical MCMC transition kernels for FAB and SNFs.
  • Look into numerical instability in the buffer: we sometimes see a single very large importance weight drive all the other importance weights to zero. On the other problems this isn't much of an issue, but on the GMM problem, close to initialisation, the flow often places very low probability on some of the modes, producing such large importance weights (a possible mitigation is sketched after this list).
  • The Metropolis MCMC transition kernel has no option to disable step-size tuning for all of training, as set_eval_mode=False results in the step size being tuned even if adjust_step_size was initially set to False.
  • The evaluation scripts can also be made cleaner, and run by default at the end of training.
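
One possible mitigation for the instability above, as a sketch (the quantile clipping and its threshold are assumptions, not the repo's current behaviour):

```python
import torch

def stabilise_log_weights(log_w, clip_quantile=0.99):
    """Clip log weights at an upper quantile before normalising, so a single
    extreme weight (e.g. from a missed GMM mode) cannot drive all other
    normalised weights to zero. clip_quantile is an illustrative choice."""
    threshold = torch.quantile(log_w, clip_quantile)
    log_w = torch.minimum(log_w, threshold)
    return log_w - torch.logsumexp(log_w, dim=0)  # normalised in log space
```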

fix: ais geometric spacing

Fix the bug in geometric spacing for AIS. This doesn't currently affect any of the experiments, as they use linear spacing. A sketch of the intended schedule is below.
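
For reference, a sketch of what linear vs. geometric spacing of the AIS inverse temperatures might look like (the smallest non-zero beta is an arbitrary illustrative choice):

```python
import numpy as np

def ais_betas(n_intermediate, spacing="linear", geom_start=0.01):
    """Inverse-temperature schedule from base (beta=0) to target (beta=1).

    A pure geometric sequence cannot contain 0, so this sketch pins
    beta_0 = 0 and spaces the remaining betas geometrically; geom_start
    is an illustrative choice for the smallest non-zero beta.
    """
    if spacing == "linear":
        return np.linspace(0.0, 1.0, n_intermediate + 2)
    if spacing == "geometric":
        return np.concatenate(([0.0], np.geomspace(geom_start, 1.0, n_intermediate + 1)))
    raise ValueError(f"unknown spacing: {spacing}")
```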

feat: many well evaluation

For the Many Well problem, allow evaluation of samples and log weights without having to specify log_prob_fn.

Add HMC mass matrix

  • Add an option to specify the mass matrix, instead of always assuming unit mass (see the leapfrog sketch after this list).
  • Clean up the HMC code, making it more modular.
  • Add an option for an evaluation mode in which none of HMC's parameters are tuned.
  • Add an example notebook visualising AIS, including the effect of the number of distributions on the effective sample size.
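
A sketch of a leapfrog step with a diagonal mass matrix, to illustrate the first point (the names and the diagonal restriction are assumptions):

```python
import torch

def leapfrog(x, p, grad_log_prob, step_size, inv_mass_diag):
    """One leapfrog step with a diagonal mass matrix M (hypothetical sketch).

    With kinetic energy p^T M^{-1} p / 2, position updates use M^{-1} p;
    unit mass (the current behaviour) is inv_mass_diag = torch.ones_like(x).
    Momenta should be resampled as p ~ N(0, M) at the start of each trajectory.
    """
    p = p + 0.5 * step_size * grad_log_prob(x)  # half step on momentum
    x = x + step_size * inv_mass_diag * p       # full step on position
    p = p + 0.5 * step_size * grad_log_prob(x)  # half step on momentum
    return x, p
```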

Benchmarking

There are currently many design decisions in the algorithm; these should be benchmarked in simple tests to get a good idea of their effects. This includes:

Testing various versions of the loss

  • Standard FAB loss: a bootstrap estimate of the alpha-divergence lower bound with alpha = 2 (a sketch follows this list).
  • We could also estimate the forward KL divergence, which has an equation with a simpler form.
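
A sketch of how the alpha = 2 surrogate could be computed from AIS outputs (self-normalised weighting is one possible choice; this is not necessarily the repo's exact implementation):

```python
import torch

def fab_alpha2_surrogate(log_w_ais, log_q_x):
    """Sketch of a self-normalised surrogate for the alpha = 2 objective.

    log_w_ais: log importance weights from AIS targeting p^2 / q (no gradient).
    log_q_x:   flow log density at the AIS samples (carries the gradient).
    Since the gradient of D_2 w.r.t. the flow parameters is proportional to
    -E_{p^2/q}[grad log q], minimising this weighted negative log prob
    follows the right gradient direction.
    """
    with torch.no_grad():
        w = torch.softmax(log_w_ais, dim=0)
    return -(w * log_q_x).sum()
```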

Use exponential moving average of normalisation constant
When calculating the FAB loss, we can (1) use the unnormalised log weights returned by AIS, as this still gives an expectation proportional to the alpha-divergence with alpha = 2; (2) normalise using the current batch of weights; or (3) use an exponential moving average of the normalisation constant, calculated during the training process.

  • This also applies to the log_w of the flow model (and not just the log_w of AIS).

Currently we are doing (2), but it may be better to do (3), and it is worth comparing the performance of all three approaches. A sketch of (3) is below.
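
A sketch of option (3); averaging in log space is one of several reasonable choices:

```python
import math
import torch

class LogZEMA:
    """Exponential moving average of the log normalisation constant across
    training batches (hypothetical sketch, not the repo's API)."""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.log_z = None

    def update(self, log_w):
        # Per-batch estimate: log Z ≈ logsumexp(log_w) - log N.
        log_z_batch = (torch.logsumexp(log_w, dim=0) - math.log(log_w.numel())).item()
        if self.log_z is None:
            self.log_z = log_z_batch
        else:
            self.log_z = self.decay * self.log_z + (1.0 - self.decay) * log_z_batch
        return self.log_z
```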

Testing various transition operators

  • Metropolis
  • Hamiltonian Monte Carlo (with various settings for tuning the step size)
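
A minimal fixed-step-size Metropolis kernel of the kind discussed in the GMM issue above (names are illustrative):

```python
import torch

def metropolis_step(x, log_prob_fn, step_size=1.0):
    """Random-walk Metropolis with a fixed step size (no tuning).

    The Gaussian proposal is symmetric, so the acceptance ratio needs no
    correction term. x has shape (batch, dim)."""
    proposal = x + step_size * torch.randn_like(x)
    log_accept_prob = log_prob_fn(proposal) - log_prob_fn(x)
    accept = torch.rand(x.shape[0]).log() < log_accept_prob
    return torch.where(accept.unsqueeze(-1), proposal, x)
```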

Performance bottlenecks
Which parts of FAB are the slowest? Can we add JIT compilation to speed these up?
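
As one illustration of the idea, TorchScript can be applied to small per-step functions; the function below and where the split falls are assumptions:

```python
import torch

# Scripting the per-distribution AIS weight update removes Python overhead
# from a function called once per intermediate distribution.
@torch.jit.script
def ais_log_weight_update(log_w: torch.Tensor,
                          log_p_new: torch.Tensor,
                          log_p_old: torch.Tensor) -> torch.Tensor:
    return log_w + log_p_new - log_p_old
```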

add val.pt to experiments/aldp/data

I would like to play around with the experiments for alanine dipeptide.
In order to use the train.py file in experiments/aldp, I need the val.pt file that is used by evaluate_aldp.

It would be really helpful if this data could also be provided, so that I can run the code on my local machine.

BNNs

Create BNN target problem

Implement more losses

Add the following alternative losses:

  • Maximise log prob of AIS samples
  • Importance-weighted estimate of the forward KL with AIS samples

Run tests on double well / many well & GMM problems.
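
Sketches of both losses, assuming AIS returns samples x with detached log weights log_w_ais and the flow provides log_q_x at those samples (all names are hypothetical):

```python
import torch

def ais_nll_loss(log_q_x):
    """Loss 1: maximise the flow log prob of AIS samples (plain NLL)."""
    return -log_q_x.mean()

def iw_forward_kl_loss(log_q_x, log_w_ais):
    """Loss 2: importance-weighted forward KL estimate.

    Here log_w_ais are weights from AIS targeting p itself (unlike the
    alpha = 2 loss, which targets p^2 / q), so reweighting approximates
    -E_p[log q] up to a constant in the flow parameters."""
    with torch.no_grad():
        w = torch.softmax(log_w_ais, dim=0)
    return -(w * log_q_x).sum()
```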

Code efficiency improvements

  • Allow transition operator to be an optional argument
  • Re-use flow and target evaluations (and \nabla_x log prob(x)) in AIS
  • Clean up config files to make running experiments easier

Create sample based test set for many well problem.

Currently, we have manually placed points on the distribution's modes. Additionally, we can create a 2D test set via MCMC and then, for higher-dimensional Many Well problems, draw pairs of dimensions from it to get samples approximately distributed as p(x) (sketched below).
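
A sketch of the pairing construction, which works because the Many Well density factorises over pairs of dimensions (names are illustrative):

```python
import torch

def many_well_test_set(samples_2d, dim, n_samples):
    """Stack independently drawn pairs from a 2D MCMC test set to get
    approximate samples from the dim-dimensional Many Well target.

    samples_2d: tensor of shape (n_2d, 2) from MCMC on the 2D problem."""
    assert dim % 2 == 0
    blocks = []
    for _ in range(dim // 2):
        idx = torch.randint(0, samples_2d.shape[0], (n_samples,))
        blocks.append(samples_2d[idx])   # (n_samples, 2)
    return torch.cat(blocks, dim=-1)     # (n_samples, dim)
```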

Jupyter/Colab Notebook for Alanine Dipeptide

  • Simplified version of the alanine dipeptide training script as an easy-to-use Jupyter notebook that can run on Google Colab
  • Should produce some meaningful results after ~20 min to 2 h of training

feat: Add valid samples abstraction

Make it easy for the user to supply a criterion for samples to be valid.
Automatically filter invalid samples from the buffer.
By default this can be when samples are out of bounds, or when the target/flow log prob is infinite/NaN.
Currently this is done directly in the buffer, but it would be better to expose the option to the user, so that they can control it and its effects are clear.
For alanine dipeptide this could be used to optionally filter based on chirality.
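
A sketch of what the default criterion and the user hook could look like (all names are hypothetical):

```python
import torch

def default_is_valid(x, log_p_x, log_q_x, bounds=None):
    """Default criterion: finite target/flow log probs and, optionally,
    samples inside a bounding box (lo, hi)."""
    valid = torch.isfinite(log_p_x) & torch.isfinite(log_q_x)
    if bounds is not None:
        lo, hi = bounds
        valid = valid & ((x >= lo) & (x <= hi)).all(dim=-1)
    return valid

# The buffer would then filter before adding, with a user-replaceable hook,
# e.g. a chirality check for alanine dipeptide:
# mask = is_valid_fn(x, log_p_x, log_q_x); buffer.add(x[mask], log_w[mask])
```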

feat: clean configs

Configs for dw4 and many well should be made cleaner.

  • Comments explaining hyper-parameter choices.
  • Better structuring, e.g. specifying buffer length in terms of a number of batches.
  • Remove unused arguments (e.g. remove the separate options for use_buffer and prioritised_buffer).

Improve sample efficiency

Look into using replay memory (saving samples, log weights and target log probs in a large data structure and re-using them), and/or PPO-style re-use of samples; using each sample only once, as in the current approach, seems very inefficient. A minimal buffer sketch is below.
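
A minimal sketch of such a buffer (illustrative only; the existing prioritised buffer may differ):

```python
import torch

class SimpleReplayBuffer:
    """Ring buffer storing samples with their log weights and target log
    probs, from which minibatches are drawn for re-use."""

    def __init__(self, max_size, dim):
        self.x = torch.zeros(max_size, dim)
        self.log_w = torch.zeros(max_size)
        self.log_p = torch.zeros(max_size)
        self.ptr, self.size, self.max_size = 0, 0, max_size

    def add(self, x, log_w, log_p):
        n = x.shape[0]
        idx = (self.ptr + torch.arange(n)) % self.max_size  # wrap around
        self.x[idx], self.log_w[idx], self.log_p[idx] = x, log_w, log_p
        self.ptr = (self.ptr + n) % self.max_size
        self.size = min(self.size + n, self.max_size)

    def sample(self, batch_size):
        idx = torch.randint(0, self.size, (batch_size,))
        return self.x[idx], self.log_w[idx], self.log_p[idx]
```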
