
Comments (8)

segasai commented on August 20, 2024

Hi,

I'll start with this example, which shows that the uniform sampler does utilise all the cores if you have a computationally heavy likelihood function:

import numpy as np
import dynesty
import multiprocessing as mp


nlive = 1000
printing = True

ndim = 10
gau_s = 0.01


def loglike_gau(x):
    # busy-work loop to make the likelihood artificially expensive
    for i in range(1000000):
        1 + 1
    return (-0.5 * np.log(2 * np.pi) * ndim - np.log(gau_s) * ndim -
            0.5 * np.sum((x - 0.5)**2) / gau_s**2)


def prior_transform_gau(x):
    return x



LOGZ_TRUTH_GAU = 0


def test_pool_samplers():
    # this tests how the samplers deal with queue_size > 1
    rstate = np.random.default_rng(2)

    with mp.Pool(10) as pool:
        sampler = dynesty.NestedSampler(loglike_gau,
                                        prior_transform_gau,
                                        ndim,
                                        nlive=nlive,
                                        sample='unif',
                                        pool=pool,
                                        queue_size=10,
                                        rstate=rstate,
                                        first_update={
                                            'min_eff': 90,
                                            'min_ncall': 10
                                        })
        sampler.run_nested(print_progress=printing)
        assert (abs(LOGZ_TRUTH_GAU - sampler.results['logz'][-1]) <
                5. * sampler.results['logzerr'][-1])


if __name__ == '__main__':
    test_pool_samplers()

Regarding what's happening in your case: there is not enough information at the moment to diagnose it (one needs a reproducible example).
But the key difference of uniform sampling is that the proposal of points within ellipsoids is done in a single thread; only afterwards are the likelihoods evaluated in parallel.

One possibility for what you're seeing is that the bounding ellipsoid for your problem is extreme (i.e. very elongated, much longer than the side of the unit cube). In that case most of the points proposed within the ellipsoid will fall outside the cube, and the code will try again and again; all of that happens in a single thread, before the likelihood function is ever evaluated.

Another possibility is that your function is just too heavy to pickle and that dominates the overheads.

You can test the first hypothesis by lowering the threshold for this warning in the dynesty source:

if niter > threshold_warning:

You can test the second hypothesis by using dynesty.pool.Pool, which eliminates the per-call pickling overhead.

Finally, if you cannot share a reproducible example, you should at least share the exact way you call dynesty (with all the information, such as nlive, ndim, etc.) and ideally all the output from the problematic unif run.


MikhailBeznogov commented on August 20, 2024

Hello,

Thank you for your help. I will address your questions below.

  1. Here is how I call dynesty with "over-subscription" (ndim=7, npoints=1000):
with Pool() as MP_pool:
    sampler  = dynesty.DynamicNestedSampler(Chi2_Tot,Prior,ndim,bound='multi',
                                            sample='unif', 
                                            first_update={'min_eff':0.50},
                                            update_interval=500.5,
                                            bootstrap=50,enlarge=1.0,
                                            pool = MP_pool,queue_size=1600)
    sampler.run_nested(n_effective=10000,dlogz_init=0.05,nlive_init=npoints,
                       nlive_batch=500)
  2. I do not think that pickling is an issue, as dynesty works efficiently in parallel with other samplers (see the rslice examples in my opening post). Moreover, emcee, as well as my tests where I evaluate the likelihood in parallel using a multiprocessing pool's map, also work fine with high parallel efficiency. The former requires enough (>1000) walkers and the latter requires setting chunksize to at least 100 to be really parallel efficient, but I suppose that is because the likelihood takes only ~1 ms to evaluate and a multiprocessing pool inevitably introduces some overhead.
  3. I think that very elongated ellipsoids might indeed be the cause of the issue. The posterior distributions of some of the parameters have very heavy tails extending to the limits of the model parameters' ranges. If this is the case, is there any way to circumvent the issue? I really prefer the unif sampler, as it is more robust and does not require estimating the proper number of walks or slices.

I will send a working example privately.


segasai commented on August 20, 2024

Hi,

I don't quite agree with your comment number 2. The reason is that when you use rslice or rwalk with, say, walks=50, each worker executes at least 50 likelihood calls per pickle; with the uniform sampler it is always 1 call per pickle. Because of that, rslice/rwalk will always look better in terms of parallelisation.
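To make that argument concrete, here is a back-of-the-envelope sketch (the millisecond figures are made-up illustrative numbers, not measurements from this thread) of the fraction of per-point wall time spent on serialisation rather than likelihood work:

```python
def serial_overhead_fraction(pickle_ms, like_ms, calls_per_pickle):
    # fraction of time spent on (de)serialisation rather than likelihood work
    return pickle_ms / (pickle_ms + like_ms * calls_per_pickle)


# illustrative numbers: 1 ms to pickle a point, 1 ms per likelihood call
print(serial_overhead_fraction(1.0, 1.0, 1))   # unif: 1 call per pickle -> 0.5
print(serial_overhead_fraction(1.0, 1.0, 50))  # rwalk/rslice, walks=50 -> ~0.02
```

With a ~1 ms likelihood, the uniform sampler's 1-call-per-pickle pattern can spend a large share of its time on overhead, while 50 calls per pickle makes the same overhead nearly negligible.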

But certainly option 3 is quite possible.
And I think there are a few possible improvements in that area: one is doing the ellipsoidal sampling in parallel (it would have the overhead of sending over the ellipsoidal bounds); the other is detecting the degenerate ellipsoids and perhaps reshaping them somewhat.


segasai commented on August 20, 2024

An update on my side.
I had a bit of time to look into this. A couple of points:

  • If you make your likelihood function computationally heavier, the parallelization works as it should, so this is clearly an inefficiency that only shows up for fast likelihood functions.
  • There are some known issues (see #427), and possibly others, that lead to inefficient sampling in some cases. I'm still investigating.


MikhailBeznogov commented on August 20, 2024

Hello,

Thanks for the update.

If fast likelihood functions cause the sampler to lose parallel efficiency, perhaps adding an option to vectorize them (i.e., evaluate the likelihood at more than one point per call) would help? For example, this is implemented in JohannesBuchner/UltraNest and seems to be efficient if the number of points requested per call is large enough. If the likelihood does not support vectorization itself, it can easily be "wrapped" by mapping:

def Chi2_Tot_Vect(points):
    return np.array(list(map(Chi2_Tot,points)))


segasai commented on August 20, 2024

I think #427 addresses some of the issues with updating bounds, so I am reasonably convinced that things are now functioning as they should.
But it is clear that in this case

  • the ellipsoidal sampling in your problem ends up being inefficient (I don't have time to investigate that, but I assume it must be related to the shape of the posterior);
  • if the likelihood is very fast, the parallelisation of the uniform sampler is not very efficient. That has various solutions, but it's a long-term project.


segasai commented on August 20, 2024

I'm closing this issue for the time being.
A different parallelization scheme for the unif sampler would still be good, but I don't think there is a bug per se.


MikhailBeznogov commented on August 20, 2024

Hello,

Sorry for not replying earlier.

Yes, I agree, it is not a bug, just a specific feature of the implementation.

Thank you for your help.

