Giter Club home page Giter Club logo

polytopesamplermatlab's Introduction

PolytopeSampler

PolytopeSampler is a Matlab implementation of constrained Riemannian Hamiltonian Monte Carlo for sampling from high dimensional disributions on polytopes. It is able to sample efficiently from sets and distributions with more than 100K dimensions.

Quick Tutorial

PolytopeSampler samples from distributions of the form exp(-f(x)), for a convex function f, subject to constraints Aineq * x <= bineq, Aeq * x = beq and lb <= x <= ub.

The function f can be specified by arrays containing its first and second derivative or function handles. Only the first derivative is required. By default, f is empty, which represents a uniform distribution. If the first derivative is a function handle, then the function and its second derivatives must also be provided.

To sample N points from a polytope P, you can call sample(P, N). The function sample will

  1. Find an initial feasible point
  2. Run constrained Hamiltonian Monte Carlo
  3. Test convergence of the sampling algorithm by computing Effective Sample Size (ESS) and terminate when ESS >= N. If the target distribution is uniform, a uniformity test will also be performed.

Extra parameters can be set up using opts. Some useful parameters include maxTime and maxStep. By default, they are set to

                        maxTime: 86400 (max sampling time in seconds)
                        maxStep: 300000 (maximum number of steps)

The output is a struct o, which stores samples generated in o.samples and a summary of the sample in o.summary. o.samples is an array of size dim x #steps.

Example

We demonstrate PolytopeSampler using a simple example, sampling uniformly from a simplex. The polytope is defined by

>> P = struct;
>> d = 10;
>> P.Aeq = ones(1, d);
>> P.beq = 1;
>> P.lb = zeros(d, 1);

The polytope has dimension d = 10 with constraint sum_i x_i = 1 and x >= 0. This is a simplex. To generate 200 samples uniformly from the polytope P, we call the function sample().

>> o = sample(P, 200);
  Time spent |  Time reamin |                  Progress | Samples |  AccProb | StepSize |  MixTime
00d:00:00:01 | 00d:00:00:00 | ######################### | 211/200 | 0.989903 | 0.200000 |     11.2
Done!

We can access the samples generated using

>> o.samples

We can print a summary of the samples:

>> o.summary

ans =

  10ร—7 table

                     mean        std         25%         50%         75%      n_ess      r_hat 
                   ________    ________    ________    ________    _______    ______    _______

    samples[1]     0.093187    0.091207    0.026222    0.064326    0.13375    221.51    0.99954
    samples[2]     0.092815    0.086905    0.027018    0.066017    0.13221    234.59     1.0301
    samples[3]      0.10034    0.090834    0.030968    0.075631    0.13788    216.56     1.0159
    samples[4]      0.10531    0.092285    0.035363    0.077519     0.1481    235.25     1.0062
    samples[5]      0.10437    0.087634    0.034946    0.080095     0.1533    212.54    0.99841
    samples[6]       0.1029    0.093724    0.028774    0.074354    0.15135     227.6     1.0052
    samples[7]       0.1042    0.083084    0.038431    0.081964    0.15352    231.54     1.0008
    samples[8]     0.088778    0.086902    0.025565    0.062473    0.11837    229.69     1.0469
    samples[9]      0.10627     0.09074    0.036962    0.084294    0.15125    211.64    0.99856
    samples[10]     0.10184    0.084699    0.035981    0.074923    0.14578    230.63     1.0277

n_ess shows the effective sample size of the samples generated. r_hat tests the convergence of the sampling algorithm. A value of r_hat close to 1 indicates that the algorithm has converged properly.

See demo.m for more examples, including examples of sampling from non-uniform distributions.

polytopesamplermatlab's People

Contributors

constrainedsampler avatar rmtfleming avatar ruoqishen avatar santoshv avatar yintat avatar yunbum-kook avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

polytopesamplermatlab's Issues

thin samples

Need to figure out how to remove the cold section from the samples.

Rhat

Previously, Rhat is supposed to use for one chain. Now the program is sampling multiple chain. So, the Rhat function need to be updated.

recompile the program if the mex file can't run

`Warning: The following error was caught while executing 'MexSolver' class destructor:
Invalid MEX-file
'C:\Users\santo\OneDrive\Documents\GitHub\PolytopeSamplerMatlab\code\solver\CSolver_dd.mexw64':
Missing dependent shared libraries:
'VCRUNTIME140_1.dll' required by
'C:\Users\santo\OneDrive\Documents\GitHub\PolytopeSamplerMatlab\code\solver\CSolver_dd.mexw64'.

Error in MexSolver/delete (line 31)
o.func('delete', o.uid);

Error in MexSolver (line 16)
function o = MexSolver(A, precision)

Error in Solver (line 22)
o = MexSolver(A, precision);

Error in Polytope/remove_dependent_rows (line 368)
solver = Solver(o.A, 'doubledouble');

Error in Polytope/simplify (line 243)
o.remove_dependent_rows();

Error in Polytope (line 195)
o.simplify();

Error in sample (line 42)
polytope = Polytope(problem, opts);

Error in demo (line 9)
o = sample(P, 100); % Number of samples = 100

In MexSolver (line 16)
In Solver (line 22)
In Polytope/remove_dependent_rows (line 368)
In Polytope/simplify (line 243)
In Polytope (line 195)
In sample (line 42)
In demo (line 9)
Invalid MEX-file
'C:\Users\santo\OneDrive\Documents\GitHub\PolytopeSamplerMatlab\code\solver\CSolver_dd.mexw64':
Missing dependent shared libraries:
'VCRUNTIME140_1.dll' required by
'C:\Users\santo\OneDrive\Documents\GitHub\PolytopeSamplerMatlab\code\solver\CSolver_dd.mexw64'.

Error in MexSolver (line 27)
o.uid = o.func('init', uint64(randi(2^32-1,'uint32')), A);

Error in Solver (line 22)
o = MexSolver(A, precision);

Error in Polytope/remove_dependent_rows (line 368)
solver = Solver(o.A, 'doubledouble');

Error in Polytope/simplify (line 243)
o.remove_dependent_rows();

Error in Polytope (line 195)
o.simplify();

Error in sample (line 42)
polytope = Polytope(problem, opts);

Error in demo (line 9)
o = sample(P, 100); % Number of samples = 100`

Better chol and leverage score

There are two things need to be done

  1. run both col and leverage score row by row instead of col by col
  2. Reorder the loop so that it is more cache friendly

Implement Rhat

Implement Rhat and a similar summer function as other solver

DebugLogger

Make DebugLogger really log the algorithm.

warnings

Currently, there is the use of warnings scatter throughout the program.
Maybe move all to the log function.
Make the default logging involves printing out the warning.

Coverage test improvement

  • Make sure the code is updated with the current module system
  • Run all available coverage tests by one script.
  • I can choose how long is the test.
  • I can choose to run one of the item.
  • When the coverage test fails, save the output object for easier debugging. (Probably only save say the first 10 fails.)

Add a coverage test (replace the test folder)

Before further development, we need an automatic way to do a coverage test. I am lazy and won't make a unit test for everything. Instead, we will only have three tests

  • Small tests just for making sure the program works for different
  • Tests for checking the presolve is correct.
  • Tests for sampling.

Some small todo:

  • Remove the recenter in demo and move it to load problem. It probably will only confuse user.

Verify Ax=b

We didn't verify Ax=b in the sampler and the coverage test.

Volumetric Barrier

Note that
tv_ball2 has slow mixing time
tv_ball has fast mixing time.

In general, seems many polytope is small due to the uniform weight of log barrier.
Probably it is better to switch to log barrier

Test on Mac

@ruoqishen

I have changed many things for the C part. So, you should test if it runs now (for the branch newsolver)

Is ess accurate enough?

If I thin the sample according to ess, seems the output is not exactly independent.

initSampler
p_hist = zeros(1000,1);
parfor i = 1:1000
P = struct; dim = 100;
P.Aeq = ones(1, dim);
P.beq = 1;
P.lb = zeros(dim, 1);

o = sample(P, 200); % Number of samples = 100
[pVal] = uniformtest(o, struct('toPlot', true));
p_hist(i) = pVal;
i
end
  • Make a smallest example
  • Add this to coverage test

Documentation

The documentation is a bit repetitive and not simple enough.
Ideally, each test should be just a few lines.
Everything needs to be understandable for the general audience

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.