
dragonfly's Introduction


Dragonfly is an open source python library for scalable Bayesian optimisation.

Bayesian optimisation is used for optimising black-box functions whose evaluations are usually expensive. Beyond vanilla optimisation techniques, Dragonfly provides an array of tools to scale up Bayesian optimisation to expensive large scale problems. These include features/functionality that are especially suited for high dimensional optimisation (optimising for a large number of variables), parallel evaluations in synchronous or asynchronous settings (conducting multiple evaluations in parallel), multi-fidelity optimisation (using cheap approximations to speed up the optimisation process), and multi-objective optimisation (optimising multiple functions simultaneously).

Dragonfly is compatible with Python2 (>= 2.7) and Python3 (>= 3.5) and has been tested on Linux, macOS, and Windows platforms. For documentation, installation, and a getting started guide, see our readthedocs page. For more details, see our paper.

 

Installation

See here for detailed instructions on installing Dragonfly and its dependencies.

Quick Installation: If you have done this kind of thing before, you should be able to install Dragonfly via pip.

$ sudo apt-get install python-dev python3-dev gfortran # On Ubuntu/Debian
$ pip install numpy
$ pip install dragonfly-opt -v

Testing the Installation: You can import Dragonfly in python to test whether it was installed properly. If you installed from source, make sure that you move to a different directory to avoid naming conflicts.

$ python
>>> from dragonfly import minimise_function
>>> # The first argument below is the function, the second is the domain, and the third is the budget.
>>> min_val, min_pt, history = minimise_function(lambda x: x ** 4 - x ** 2 + 0.1 * x, [[-10, 10]], 10)
...
>>> min_val, min_pt
(-0.32122746026750953, array([-0.7129672]))

Due to stochasticity in the algorithms, the above values for min_val, min_pt may be different. If you run it for longer (e.g. min_val, min_pt, history = minimise_function(lambda x: x ** 4 - x**2 + 0.1 * x, [[-10, 10]], 100)), you should get more consistent values for the minimum.

If the installation fails or if there are warning messages, see detailed instructions here.

 

Quick Start

Dragonfly can be used directly from the command line by calling dragonfly-script.py, imported in Python code via the maximise_function function in the main library, or used in ask-tell mode. To help get started, we have provided some examples in the examples directory. See our readthedocs getting started pages (command line, Python, Ask-Tell) for examples and use cases.

Command line: Below is an example usage in the command line.

$ cd examples
$ dragonfly-script.py --config synthetic/branin/config.json --options options_files/options_example.txt

In Python code: The main APIs for Dragonfly are defined in dragonfly/apis. For their definitions and arguments, see dragonfly/apis/opt.py and dragonfly/apis/moo.py. You can import the main API in python code via:

from dragonfly import minimise_function, maximise_function
func = lambda x: x ** 4 - x**2 + 0.1 * x
domain = [[-10, 10]]
max_capital = 100
min_val, min_pt, history = minimise_function(func, domain, max_capital)
print(min_val, min_pt)
max_val, max_pt, history = maximise_function(lambda x: -func(x), domain, max_capital)
print(max_val, max_pt)

Here, func is the function to be optimised, domain is the domain over which func is to be optimised, and max_capital is the capital available for optimisation. The domain can be specified via a JSON file or in code. See the examples directory and our readthedocs pages for more detailed examples.
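As an illustration, a domain specified in code is a list of variable dictionaries (the same format that load_config accepts elsewhere on this page); the variable names below are purely illustrative:

```python
# Hypothetical domain specification: a 3-dimensional float variable 'x'
# and an integer variable 'k'. In practice this dict would be passed to
# dragonfly's load_config({'domain': domain_vars}) before optimising.
domain_vars = [
    {'name': 'x', 'type': 'float', 'min': 0, 'max': 1, 'dim': 3},
    {'name': 'k', 'type': 'int', 'min': 4, 'max': 8},
]
print([v['name'] for v in domain_vars])
```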

In Ask-Tell Mode: Ask-tell mode gives you more control over your experiments: you supply past results to our API in order to obtain a recommendation. See the following example for more details.

For a comprehensive list of use cases, including multi-objective optimisation, multi-fidelity optimisation, neural architecture search, and other optimisation methods (besides Bayesian optimisation), see our readthedocs pages (command line, Python, Ask-Tell).

 

Contributors

Kirthevasan Kandasamy: github, webpage
Karun Raju Vysyaraju: github, linkedin
Anthony Yu: github, linkedin
Willie Neiswanger: github, webpage
Biswajit Paria: github, webpage
Chris Collins: github, webpage

Acknowledgements

Research and development of the methods in this package were funded by DOE grant DESC0011114, NSF grant IIS1563887, the DARPA D3M program, and AFRL.

Citation

If you use any part of this code in your work, please cite our JMLR paper.

@article{JMLR:v21:18-223,
  author  = {Kirthevasan Kandasamy and Karun Raju Vysyaraju and Willie Neiswanger and Biswajit Paria and Christopher R. Collins and Jeff Schneider and Barnabas Poczos and Eric P. Xing},
  title   = {Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {81},
  pages   = {1-27},
  url     = {http://jmlr.org/papers/v21/18-223.html}
}

License

This software is released under the MIT license. For more details, please refer LICENSE.txt.

For questions, please email [email protected].

"Copyright 2018-2019 Kirthevasan Kandasamy"


dragonfly's Issues

Some Options Read in As Strings

When options such as init_capital (and I assume others) are read in from the command line, they are recognized as strings instead of numbers. This is most likely because the default value is None so argparse thinks the option should be a string.
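The described behaviour is easy to reproduce with a minimal argparse sketch (the option name mirrors the one in the report; nothing here is dragonfly code):

```python
import argparse

# With default=None and no explicit type=, argparse leaves any value
# supplied on the command line as a string.
parser = argparse.ArgumentParser()
parser.add_argument('--init_capital', default=None)
opts = parser.parse_args(['--init_capital', '10'])
print(type(opts.init_capital).__name__)  # str, not int/float
```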

Configuration for discrete strings

domain_config = load_config({
    "domain": [
        {
            "name": "update_rule",
            "type": "discrete",
            "dim": 3,
            "items": "hebbian-anti_hebbian-random_walk"
        },
        {
            "name": "K",
            "type": "int",
            "min": 4,
            "max": 8,
            # "dim": 1
        },
        {
            "name": "N",
            "type": "int",
            "min": 4,
            "max": 8,
            # "dim": 1
        },
        {
            "name": "L",
            "type": "int",
            "min": 4,
            "max": 8,
            # "dim": 1
        }
    ]
})
func_caller = CPFunctionCaller(
    None, domain_config.domain.list_of_domains[0])
optimizer = CPGPBandit(func_caller, ask_tell_mode=True)

results in the following error:

  File "cli.py", line 315, in hparams
    optimizer = CPGPBandit(func_caller, ask_tell_mode=True)
  File ".../site-packages/dragonfly/opt/gp_bandit.py", line 769, in __init__
    ask_tell_mode=ask_tell_mode)
  File ".../site-packages/dragonfly/opt/gp_bandit.py", line 187, in __init__
    ask_tell_mode=ask_tell_mode)
  File ".../site-packages/dragonfly/opt/blackbox_optimiser.py", line 41, in __init__
    options, reporter, ask_tell_mode)
  File ".../site-packages/dragonfly/exd/exd_core.py", line 96, in __init__
    self._set_up()
  File ".../site-packages/dragonfly/exd/exd_core.py", line 144, in _set_up
    self._exd_child_set_up()
  File ".../site-packages/dragonfly/opt/blackbox_optimiser.py", line 48, in _exd_child_set_up
    self._opt_method_set_up()
  File ".../site-packages/dragonfly/opt/gp_bandit.py", line 203, in _opt_method_set_up
    self._set_up_acq_opt()
  File ".../site-packages/dragonfly/opt/gp_bandit.py", line 284, in _set_up_acq_opt
    self._domain_specific_acq_opt_set_up()
  File ".../site-packages/dragonfly/opt/gp_bandit.py", line 819, in _domain_specific_acq_opt_set_up
    if self.acq_opt_method.lower() in ['direct']:
AttributeError: 'NoneType' object has no attribute 'lower'

How can I create a configuration that uses discrete string and 3 discrete integers? For example, a valid configuration for a single run would be ("hebbian", 6, 8, 7).

Same hyperparameter configuration proposed multiple times

Hello!
I've tried to perform Bayesian optimization on two float hyperparameters of an algorithm. I've noticed that the same configuration of hyperparameters has been proposed four times. Since during the optimization I've already evaluated that hyperparameter configuration and the algorithm might be expensive to evaluate, I would prefer not to re-evaluate the same exact configuration (even if the function is noisy). Why would the BO algorithm propose an already evaluated configuration?

Moreover, considering a single hyperparameter, dragonfly proposed the same value several times. Is this behavior expected? I thought that, since the hyperparameter lives in a real space, it should be very unlikely for a specific value to be proposed several times. For example, the value -2.4375 has been proposed (and thus evaluated) multiple times for both hyperparameters.

I attach the code and output below. Thanks :)

domain = [[-4, 1], [-4, 1]]
max_capital = 30
func_caller = EuclideanFunctionCaller(None, domain)
opt = gp_bandit.EuclideanGPBandit(func_caller, ask_tell_mode=True)
opt.initialise()

best_x, best_y = None, float('-inf')
for _ in range(max_capital):
    x = opt.ask()
    y = objective(x)
    opt.tell([(x, y)])
    print('x: %s, y: %s' % (x, y))
    if y > best_y:
        best_x, best_y = x, y

Output (same configurations in bold):

x: [-1.83249193 -2.70616307], y: -0.12166496409071688
x: [0.10388933 0.68413865], y: -0.10050714794839039
x: [-0.84051921 -0.89712971], y: -0.092327749261174
x: [-2.15919806 -1.96698749], y: -0.08681995372508136
x: [-3.5505209 -3.43387401], y: -0.09681205413872439
x: [ 0.67781299 -3.82671431], y: -0.10915992425239594
x: [-2.125 -1.8125], y: -0.09220882544663501
x: [-2.4375 -2.4375], y: -0.08254446522619925
x: [-2.4375 -1.8125], y: -0.10042781180939035
x: [-2.4375 -2.4375], y: -0.08067308981463231
x: [-2.125 -2.125], y: -0.09679552201302356
x: [-0.5625 -1.1875], y: -0.11277140983127563
x: [-0.875 -0.5625], y: -0.0927984315298233
x: [-2.4375 -2.4375], y: -0.09163390693191012
x: [-0.5625 0.0625], y: -0.10788750373881312
x: [-0.5625 -0.5625], y: -0.13080243859682508
x: [-3.6875 -0.5625], y: -0.10579842491465447
x: [-3.6875 -0.5625], y: -0.09142648384218793
x: [-2.984375 0.4140625], y: -0.11603514510133543
x: [ 0.6875 -3.0625], y: -0.12286569420847843
x: [ 0.6875 -2.4375], y: -0.12817945757533578
x: [-2.4375 -3.21875], y: -0.1098404650344026
x: [-3.0625 -0.5625], y: -0.1311725186543427
x: [-2.4375 -2.4375], y: -0.03886725461684583
x: [-1.1875 -2.4375], y: -0.13435488316472557
x: [-2.125 0.6875], y: -0.08744121017349624
x: [-3.6875 -3.6875], y: -0.08649844613275853
x: [0.6875 0.6875], y: -0.1305963555309291
x: [-3.375 -2.4375], y: -0.09337741276060564
x: [-1.8125 -2.4375], y: -0.09226863728914639
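An observation (not from the thread itself): the repeated values above all lie on a dyadic grid of the domain [-4, 1], which would be consistent with a grid-based acquisition optimiser revisiting the same cells. A quick check:

```python
# Normalise each repeated coordinate to [0, 1] and check it is a
# multiple of 1/16 (a dyadic grid point). The domain is [-4, 1].
lo, hi = -4.0, 1.0
repeated = [-2.4375, -1.8125, -2.125, -0.5625, -3.6875]
for v in repeated:
    z = (v - lo) / (hi - lo)            # normalise to [0, 1]
    assert z * 16 == int(z * 16)        # exactly on the 1/16 grid
print('all repeated coordinates are dyadic grid points')
```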

GAOptimiser for Euclidean spaces

Hello,
The documentation for the Ask-Tell Mode states: In this interface, the optimizer is explicitly created via <domain>GPBandit, <domain>GAOptimiser, or <domain>RandomOptimiser, where <domain> is replaced by Euclidean or CP depending on the domain used.
I wanted to use GA with a Euclidean domain, but didn't find any class that supports this functionality (i.e. no EuclideanGAOptimiser class exists).

Too much time spent in isolated iteration

I was measuring the time spent in each iteration of minimise_function and noticed that on my laptop it takes less than a second most of the time, but occasionally an iteration takes an enormous amount of time. For example:

myfunction_spent_time: 3
init dragonfly after computing the function:
Thu Apr 25 18:18:35 EDT 2019
final dragonfly after computing the function:
Thu Apr 25 18:27:19 EDT 2019

Almost 10 minutes. This is weird. Any idea where the time is being consumed?
Is there any way to limit the execution time to some maximum amount of time?

installing dragonfly under pypy fails

I've tried to accelerate dragonfly by running it under pypy, a Python implementation considerably faster than traditional CPython.

The problem pypy has is compatibility, and that is precisely the case with dragonfly.

Any idea whether dragonfly will be supported under pypy at some point?

Does this work for neural net that takes weeks to train?

Hi, I read your whitepaper and it seems that the algorithms used here mostly sample the hyperparameter space and then, based on the training results, build a model of the hyperparameter space and cleverly select subsequent parameters that are more likely to yield a better result.

However, if my model already takes weeks to train, won't sampling take extremely long? Am I wrong? How should I use this effectively with a large deep learning model?

Errors encountered when using Dragonfly #3

This is a non-abortive error, but it is an error nonetheless:

/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/utils/general_utils.py:368: RuntimeWarning: divide by zero encountered in true_divide
exp_probs = np.exp((fitness_vals - mean_param)/scaling_param)
/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/utils/general_utils.py:369: RuntimeWarning: invalid value encountered in true_divide
return exp_probs/exp_probs.sum()

Errors encountered while using Dragonfly #4

Another non-abortive error:

/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/gp/gp_core.py:184: RuntimeWarning: invalid value encountered in sqrt
uncert = np.sqrt(np.diag(post_covar))

ip_mroute.c copying corrupt packet to userspace (IGMPMSG_NOCACHE)

I have a bad feeling about ip_mroute.c corrupting a packet when it's copying from kernel to userspace, inside of IGMPMSG_NOCACHE upcall.

I am currently writing an IGMP daemon and porting it to Dragonfly. The code below works fine on Linux, FreeBSD, OpenBSD, and NetBSD.

Somehow my daemon is receiving an incorrect destination IP address in the copied packet. Consider the following incoming packet and debug-log snippet. Note that only the last tuple (250) seems to be correct.
13:55:54.286252 IP 192.168.1.105.3423 > 239.255.255.250.3423: UDP, length 54
13:55:54:2863 Route activation request for group: 1.0.1.250 from src: 192.168.1.105 not valid. Ignoring

Code for receiving packet from kernel:
struct iovec ioVec[1] = { { recv_buf, BUF_SIZE } };
struct msghdr msgHdr = (struct msghdr){ NULL, 0, ioVec, 1, &cmsgUn, sizeof(cmsgUn), MSG_DONTWAIT };
int recvlen = recvmsg(pollFD[0].fd, &msgHdr, 0);
acceptIgmp(recvlen, msgHdr);

Code to process packet:
void acceptIgmp(int recvlen, struct msghdr msgHdr) {
    struct igmpmsg *igmpMsg = (struct igmpmsg *)(recv_buf);
    struct ip *ip = (struct ip *)recv_buf;
    register uint32_t src = ip->ip_src.s_addr, dst = ip->ip_dst.s_addr, group;

    switch (igmpMsg->im_msgtype) {
    case IGMPMSG_NOCACHE:
        for (i = 0; i < recvlen; i++)
            sprintf(bla + i * 5, "0x%02hhx:", recv_buf[i]);
        my_log(LOG_DEBUG, 0, "BUFFER: %s", bla);

This code then proceeds to output the buffer:
13:55:54:2863 BUFFER: 0x45:0x00:0x52:0x00:0xda:0xed:0x00:0x40:0x04:0x11:0xe9:0xa1:0xc0:0xa8:0x01:0x69:0x01:0x00:0x01:0xfa:
The data in this buffer is a correct IP header, except for the destination:

Source in packet copied kernel: 0xc0:0xa8:0x01:0x69 (192.168.1.105)
Destination: 0x01:0x00:0x01:0xfa (1.0.1.250)
Destination should be: 0xef:0xff:0xff:0xfa (239.255.255.250)

It may be me, but I cannot seem to find any documentation pertaining to the multicast routing api on dragonfly.

Limits on parameters, metrics, etc.

Hi. I am interested in exploring dragonfly for BO and wanted to know whether there are any hard or advised limits on the number of parameters, metrics, parameter constraints, metric constraints, etc. in the functionality.

hierarchical definition of domains

How could I define non-parametric or hierarchical domains?

As a hierarchical example: you can choose between SGD and ADAM; if you choose the former, you must specify a learning rate and momentum; if you choose the latter, you must specify gamma and beta on top of the learning rate. There are also clipping values if you use a clipping optimizer.

As a non-parametric example, one could choose to add a new layer, which could be fully connected, batch_norm, or dropout. Depending on the type of layer, there are several parameters to be specified (e.g. dropout rate or number of units). The catch is that the number of layers is potentially infinite.

Do you think I should opt for different optimization algorithms, such as evolutionary algorithms, or something else?
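One common workaround for conditional spaces (not a dragonfly feature, just a general BO pattern; all names below are hypothetical) is to flatten the hierarchy: declare every branch's parameters unconditionally and have the objective ignore the ones that are inactive for the chosen branch:

```python
# Flattened conditional space: lr/momentum/gamma/beta are always proposed;
# the objective only reads the parameters relevant to the chosen optimiser.
# The quadratic "score" is a stand-in for training a real model.
def objective(optimiser, lr, momentum, gamma, beta):
    if optimiser == 'sgd':
        return -(lr - 0.1) ** 2 - (momentum - 0.9) ** 2      # gamma/beta ignored
    else:  # 'adam'
        return -(lr - 0.001) ** 2 - (gamma - 0.9) ** 2 - (beta - 0.999) ** 2

print(objective('sgd', 0.1, 0.9, 123.0, 456.0))   # 0.0: inactive params unused
```

The downside is that the surrogate model wastes capacity on inactive dimensions, which is why this works best for shallow hierarchies.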

How to use `DiscreteEuclideanDomain` in Bayesian optimization

Hi,

I recently tried to use the DiscreteEuclideanDomain in a Bayesian optimization (in ask-tell mode), but I am not sure how to combine it with the other objects in dragonfly. Here is how I initialized things:

    domain = EuclideanDomain([[lo, up] for lo, up in zip(lb_list, ub_list)])
    fidel_space = DiscreteEuclideanDomain([fidel_range])
    func_caller = EuclideanFunctionCaller(None,
                                          raw_domain=domain,
                                          raw_fidel_space=fidel_space,
                                          fidel_cost_func=cost_func,
                                          raw_fidel_to_opt=fidel_range[-1])
    opt = EuclideanGPBandit(func_caller,
                            ask_tell_mode=True,
                            is_mf=True)

However, I get an error at the point where the EuclideanFunctionCaller is created:

experiment_caller.py", line 428, in get_normalised_fidel_coords
    return map_to_cube(Z, self.raw_fidel_space.bounds)
AttributeError: 'DiscreteEuclideanDomain' object has no attribute 'bounds'

Should I use a different FunctionCaller in this case?

Strange distance computation for DiscreteDomain and ProdDiscreteDomain

For all domains, the compute_distance method behaves as I'd expect, except for DiscreteDomain:
https://github.com/dragonfly/dragonfly/blob/master/dragonfly/exd/domains.py#L177

and ProdDiscreteDomain:
https://github.com/dragonfly/dragonfly/blob/master/dragonfly/exd/domains.py#L290

where it returns 1 if the elements are the same and 0 if they are different (the opposite of what a distance should do). Is that intended?

from dragonfly.exd.domains import DiscreteNumericDomain, DiscreteDomain, CartesianProductDomain

d1 = DiscreteNumericDomain([0.0, 1.0])
d2 = DiscreteDomain(['same', 'different'])
d = CartesianProductDomain([d1, d2])

print(d.compute_distance([0.0, 'same'], [0.0, 'same']))        # outputs 1.0, would expect 0.0
print(d.compute_distance([0.0, 'same'], [0.0, 'different']))   # outputs 0.0, would expect 1.0
print(d.compute_distance([0.0, 'same'], [1.0, 'different']))   # outputs 1.0, would expect 2.0
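For reference, the distance the issue expects for a product of a numeric and a categorical domain (Euclidean on the numeric part plus a 0/1 mismatch on the categorical part) can be sketched in plain Python; this is only an illustration of the expected semantics, not dragonfly code:

```python
def expected_distance(p1, p2):
    # Euclidean distance on the numeric coordinate plus a Hamming-style
    # 0/1 mismatch indicator on the categorical coordinate.
    return abs(p1[0] - p2[0]) + (0.0 if p1[1] == p2[1] else 1.0)

print(expected_distance([0.0, 'same'], [0.0, 'same']))        # 0.0
print(expected_distance([0.0, 'same'], [0.0, 'different']))   # 1.0
print(expected_distance([0.0, 'same'], [1.0, 'different']))   # 2.0
```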

Initial guess

Hello,

First, thank you for dragonfly 👍 It really is great !

Is there currently a way to start the evaluation with an initial guess, or to feed a few initial guesses to be evaluated during the very first round of evaluating an objective function? (Here an initial guess = a set of parameters that we know from domain expertise would yield good performance with the given objective function.)

Otherwise, would you have any clues on how to implement this ?

Which version of tensorflow is used in neural architecture search?

I'm using tensorflow 1.14.0, and running the demo code demo_nas.py encountered the following error:

  File "/home/albert_wei/WorkSpaces_2020/dragonfly-master/dragonfly/exd/exd_core.py", line 707, in run_experiments
    self.run_experiment_initialise()
  File "/home/albert_wei/WorkSpaces_2020/dragonfly-master/dragonfly/exd/exd_core.py", line 466, in run_experiment_initialise
    self.perform_initial_queries()
  File "/home/albert_wei/WorkSpaces_2020/dragonfly-master/dragonfly/exd/exd_core.py", line 350, in perform_initial_queries
    self._wait_for_a_free_worker()
  File "/home/albert_wei/WorkSpaces_2020/dragonfly-master/dragonfly/exd/exd_core.py", line 497, in _wait_for_a_free_worker
    self.worker_manager.get_poll_time_real())
  File "/home/albert_wei/WorkSpaces_2020/dragonfly-master/dragonfly/exd/exd_core.py", line 487, in _wait_till_free
    self._update_history(qinfo)
  File "/home/albert_wei/WorkSpaces_2020/dragonfly-master/dragonfly/exd/exd_core.py", line 229, in _update_history
    self._exd_child_update_history(qinfo)
  File "/home/albert_wei/WorkSpaces_2020/dragonfly-master/dragonfly/opt/blackbox_optimiser.py", line 95, in _exd_child_update_history
    self._update_opt_point_and_val(qinfo, query_is_at_fidel_to_opt)
  File "/home/albert_wei/WorkSpaces_2020/dragonfly-master/dragonfly/opt/blackbox_optimiser.py", line 118, in _update_opt_point_and_val
    if qinfo.val > self.curr_opt_val:
TypeError: '>' not supported between instances of 'str' and 'float'

How to fix this error? Thanks.

Issue with specifying domain constraints

I am trying to maximize a function over an L1-ball. The final answer violates the domain_constraint. Also, if the min and max are changed to something other than 0 and 1 (like 3 and 4), it still returns a point in the unit cube. Possibly missing a step to convert back to raw parameters?

Python version: 3.6.8
Dragonfly: 0.1.4

Here is the code:

import numpy as np
from dragonfly import load_config, maximise_function

domain_vars = [{'name': 'x', 'type': 'float', 'min': 0, 'max': 1, 'dim': 3}]
domain_constraints = [
    {'name': 'quadrant', 'constraint': 'np.linalg.norm(x, ord=1) <= 0.1'},
]

def objective(x):
    x = np.array(x)
    return np.linalg.norm(x, axis=1)

# Non-MF version
config_params = {'domain': domain_vars, 'domain_constraints': domain_constraints}
config = load_config(config_params)
max_num_evals = 10 # Optimisation budget (max number of evaluations)
# Optimise
opt_val, opt_pt, history = maximise_function(objective, config.domain,
                                           max_num_evals, config=config)

The output I get is the following:

opt_pt = [array([0.99993896, 0.99996948, 0.99999809])]
opt_val = 1.7319968488696031
dir(history) = ['class', 'contains', 'delattr', 'dict', 'dir', 'doc', 'eq', 'format', 'ge', 'getattribute', 'gt', 'hash', 'init', 'init_subclass', 'le', 'lt', 'module', 'ne', 'new', 'reduce', 'reduce_ex', 'repr', 'setattr', 'sizeof', 'slotnames', 'str', 'subclasshook', 'weakref', '_get_args', '_get_kwargs', 'curr_opt_points', 'curr_opt_points_raw', 'curr_opt_vals', 'curr_true_opt_points', 'curr_true_opt_vals', 'full_method_name', 'job_idxs_of_workers', 'num_jobs_per_worker', 'prev_eval_points', 'prev_eval_true_vals', 'prev_eval_vals', 'query_acqs', 'query_eval_times', 'query_points', 'query_points_raw', 'query_qinfos', 'query_receive_times', 'query_send_times', 'query_step_idxs', 'query_true_vals', 'query_vals', 'query_worker_ids']

Minor error in the example

The front page example loads the package via
from dragonfly.dragonfly import maximise_function
I think it should read
from dragonfly import maximise_function

However if I change the code accordingly and execute the face rec example:


from __future__ import print_function
import numpy as np
import math
from dragonfly import maximise_function
# Local imports
from demos.face_rec.face_rec import face_rec
domain_bounds = [[1, 500], [0, 1000], [0, 1]]
max_capital = 25
opt_val, opt_pt, history = maximise_function(face_rec, domain_bounds, max_capital,
                                               hp_tune_criterion='post_sampling',
                                               hp_tune_method='slice')

I obtain an error and the following trace:

/Users/markusmichaelrau/MyProjects/dragonfly/opt/gp_bandit.py in _set_up_acq_opt(self)
    207     acq_opt_method = self._get_acq_opt_method()
    208     if acq_opt_method in ['direct']:
--> 209       self._set_up_acq_opt_direct()
    210     elif acq_opt_method in ['pdoo']:
    211       self._set_up_acq_opt_pdoo()

/Users/markusmichaelrau/MyProjects/dragonfly/opt/gp_bandit.py in _set_up_acq_opt_direct(self)
    240     """ Sets up optimisation for acquisition using direct/pdoo. """
    241     if self.get_acq_opt_max_evals is None:
--> 242       lead_const = 1 * min(5, self.domain.get_dim())**2
    243       self.get_acq_opt_max_evals = lambda t: np.clip(lead_const * np.sqrt(min(t, 1000)),
    244                                                      1000, 3e4)

AttributeError: 'int' object has no attribute 'get_dim'

Great package, I love it!
Best,
Markus

Question on Thompson sampling

Hi,

Thanks a lot for developing this very neat package, and making it available to the community.

I had a quick question on Thompson sampling:
It seems that Thompson sampling is mentioned as one of the default acquisition functions in the Dragonfly paper, but it was later removed in this commit:
4be7e4c#diff-e8aec5f5adb76ab99374fb1d35983c4bR149
Is there anything to be aware of when using Thompson sampling? In general, is it recommended not to use Thompson sampling?

Thanks in advance for your help!

Drawing samples from Gaussian process

I'm quite curious about the Thompson sampling acquisition in Bayesian optimization, as I don't really understand how function samples can be drawn from a Gaussian process.

It seems that in dragonfly, the core of Thompson sampling is done by the draw_gaussian_samples function:

def draw_gaussian_samples(num_samples, mu, K):
  """ Draws num_samples samples from a Gaussian distribution with mean mu and
      covariance K.
  """
  num_pts = len(mu)
  L = stable_cholesky(K)
  U = np.random.normal(size=(num_pts, num_samples))
  V = L.dot(U).T + mu
  return V

But it seems that the draw_gaussian_samples function only draws samples from the predictive distribution at a fixed set of inputs and cannot produce continuous function samples?
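For what it's worth, drawing the joint sample at a dense grid of test inputs is the standard way to approximate a continuous function draw: as the grid gets denser, each joint sample converges to a draw of the function. A self-contained numpy sketch of the same Cholesky construction (the function name, the RBF kernel, the lengthscale, and the jitter term standing in for stable_cholesky are all my assumptions, not dragonfly internals):

```python
import numpy as np

def draw_gp_function_samples(num_samples, xs, lengthscale=0.5, jitter=1e-6):
    """Jointly sample zero-mean GP values at every point in xs; each row
    approximates a draw of a continuous function on a dense grid."""
    d = xs[:, None] - xs[None, :]
    K = np.exp(-0.5 * (d / lengthscale) ** 2)              # RBF kernel matrix
    L = np.linalg.cholesky(K + jitter * np.eye(len(xs)))   # jitter for stability
    U = np.random.normal(size=(len(xs), num_samples))
    return (L @ U).T                                       # shape (num_samples, len(xs))

xs = np.linspace(0.0, 1.0, 200)
samples = draw_gp_function_samples(3, xs)
print(samples.shape)  # (3, 200)
```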

Heteroscedasticity in the multifidelity case

Can this package handle the case where the variance of the objective function differs for different fidelities? From the paper, it looks like the variance is a fixed parameter (η) but it's not clear how the value of this parameter is determined.

If it's not handled, I think it would be a great feature! There will be many cases where the variance is higher for lower fidelities. For example, I am optimizing the likelihood of a model on a large dataset. The fidelity would naturally correspond to the size, N, of the subset of the data you consider. In this case the variance will be proportional to 1/N. This information would be very valuable to the optimizer, no?
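The 1/N scaling described above is easy to verify empirically: the variance of a mean log-likelihood estimated on a random size-N subset shrinks roughly like 1/N. A numpy sketch with synthetic per-example log-likelihoods standing in for a real model (everything here is illustrative, not dragonfly code):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=100_000)  # synthetic per-example log-likelihoods

def subsample_estimate(n):
    # Mean log-likelihood computed on a random fidelity-n subset.
    return rng.choice(data, size=n, replace=False).mean()

for n in (100, 1_000, 10_000):
    var = np.var([subsample_estimate(n) for _ in range(200)])
    print(n, var)  # variance shrinks roughly like 1/n
```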

Parallel evaluation with ei

Hi!
I tried to run parallel Bayesian optimization for the branin function with following options & codes:

options = [
    {'name': 'capital_type', 'default': 'return_value'},
    {'name': 'build_new_model_every', 'default': 17},
    {'name': 'init_capital', 'default': 10},
    {'name': 'initial_method', 'default': 'rand'},
    {'name': 'euc_init_method', 'default': 'latin_hc'},
    {'name': 'acq', 'default': 'ei'},
    {'name': 'handle_parallel', 'default': 'halluc'},
    {'name': 'acq_opt_max_evals', 'default': 3},
    {'name': 'domain_kernel_type', 'default': 'matern'},
    {'name': 'domain_matern_nu', 'default': 2.5},
]
options = load_options(options)
min_val, min_pt, history = minimise_function(func, domain, opt_method='bo',
                                             max_capital=60, options=options)

in which func is the branin function and domain is the list describing the computation domain, but the acquisition function always returns the same position for evaluation; the same problem also happens with the ucb acquisition function.
Evaluated inputs: [[-0.53138911 2.16230016] [ 8.00611623 3.71417474] [ 5.71641293 11.39269664] [ 5.1832132 14.4029029 ] [-2.62100231 5.23051845] [ 8.76691612 0.13967317] [ 1.67777453 12.11758913] [-4.62652079 9.34890705] [ 3.23838423 8.55028105] [ 0.08195912 6.08355315] [ 2.5 7.5 ] [ 6.25 3.75 ] [ 6.25 3.75 ] [ 2.5 7.5 ] [ 6.25 3.75 ] [ 2.5 3.75 ] [ 6.25 3.75 ] [ 6.25 3.75 ] [ 6.25 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 6.25 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [-1.25 3.75 ] [ 6.25 3.75 ] [-1.25 11.25 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 11.25 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ] [ 2.5 3.75 ]]
If I change acq_opt_max_evals back to -1, or switch the acquisition function to ts, then everything works well:
Evaluated inputs: [[ 6.48048264e+00 5.16685860e+00] [-1.20743781e-01 9.70861310e+00] [ 4.26112145e+00 7.47955916e+00] [-4.72670657e+00 7.80306038e+00] [ 3.07030981e+00 4.05256585e+00] [-1.86899555e+00 1.08101134e+01] [ 8.96877227e+00 1.21185092e+01] [-3.45598584e+00 1.09148145e+00] [ 2.12922567e+00 1.64593026e+00] [ 8.41270012e+00 1.42543831e+01] [ 2.90100098e+00 2.57904053e+00] [-4.99999999e+00 1.50000000e+01] [-3.59375000e+00 1.47656250e+01] [-4.99999821e+00 1.49999991e+01] [ 2.96875000e+00 2.34375000e+00] [-3.50889206e+00 1.27881145e+01] [-5.00000000e+00 1.50000000e+01] [ 2.94311523e+00 2.12219238e+00] [ 2.99827576e+00 2.81227112e+00] [ 3.17812009e+00 1.87500000e+00] [ 2.86293507e+00 2.60182142e+00] [ 2.92676700e+00 2.69622803e+00] [-1.25000000e+00 3.75000000e+00] [ 6.25000000e+00 3.75000000e+00] [ 6.25000000e+00 3.75000000e+00] [ 9.53125000e+00 3.28125000e+00] [-1.25000000e+00 1.12500000e+01] [ 2.50000000e+00 3.75000000e+00] [ 6.25000000e+00 1.03125000e+01] [ 1.00000000e+01 9.32647705e+00] [-3.30993652e+00 1.12481689e+01] [-1.25000000e+00 1.03125000e+01] [ 2.50000000e+00 8.57666016e+00] [-4.91210938e+00 1.06201172e+01] [-4.27455902e+00 1.06264114e+01] [-3.12500000e+00 1.31250000e+01] [ 2.50000000e+00 5.62500000e+00] [-3.45642112e+00 1.40625002e+01] [-2.10847855e-01 6.19105339e+00] [-2.10876493e-01 6.19166850e+00] [ 9.98535156e+00 7.49267578e+00] [-3.45626198e+00 1.40625002e+01]]
Could anyone help me about this issue? Did I miss anything for parallel evaluation?

performance issue

Hello there,

I'm trying to use the function with the default settings. However, I found that performance becomes extremely slow once the number of samples exceeds roughly 500, especially compared to TPE from hyperopt. Could you please point me to a solution? Is the default setting good to use? Thank you for the clarification.

Acquisition function sampling in ask-tell mode

Hi, I am currently trying out your code and it's working very well.
However, I realised that in ask-tell mode the acquisition-function sampling seems to not update its weights. When running the provided ask-tell example from the docs and then looking at opt.history, the acquisition probabilities are all 0.25 and the acqs_to_use_counter dictionary contains only zeros. While debugging, I found that the acquisition function is indeed sampled using the weights, but the weights never get updated. Am I missing something, or does ask-tell mode currently not support updating the sampling weights?
Best regards
Jonas
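For illustration, here is a generic sketch of what adaptive acquisition sampling looks like. This is NOT Dragonfly's exact update rule (that rule is internal, so the specifics here are assumptions): draw an acquisition in proportion to its weight, count its uses, and upweight it when the point it proposed turned out well. The reported symptom is the counter and weights staying at their initial values.

```python
import random

def sample_acq(weights, rng):
    """Draw an acquisition name in proportion to its current weight."""
    total = sum(weights.values())
    r, acc = rng.uniform(0, total), 0.0
    for name, w in weights.items():
        acc += w
        if r <= acc:
            return name
    return name  # guard against floating-point round-off

rng = random.Random(0)
weights = {'ei': 1.0, 'ucb': 1.0, 'ttei': 1.0, 'add_ucb': 1.0}
acqs_to_use_counter = {name: 0 for name in weights}
for _ in range(200):
    acq = sample_acq(weights, rng)
    acqs_to_use_counter[acq] += 1     # the counter the issue reports stuck at zero
    if rng.random() < 0.3:            # stand-in for "this query improved the optimum"
        weights[acq] *= 1.1           # the weight update that appears to be skipped
```

In a working setup the counter entries grow and the weights drift away from uniform; counters that stay at zero mean the feedback step is never reached.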

Constraints do not work in code

I took the code in examples/synthetic/hartmann6_4/in_code_demo.py
and modified it to include the simple constraint from examples/synthetic/hartmann3_constrained/config.json

I basically just added this in place of the old config_params:

constraints = {
  "constraint_1": {
      "name": "quadrant",
      "constraint": "np.linalg.norm(x[0:2]) <= 0.5"
    }
}
config_params = {'domain': domain_vars, 'fidel_space': fidel_vars,
                   'fidel_to_opt': fidel_to_opt, 'domain_constraints': constraints}

I get this error when I try to run it:

Traceback (most recent call last):
  File "in_code_demo.py", line 58, in <module>
    main()
  File "in_code_demo.py", line 39, in main
    config = load_config(config_params)
  File "/home/chris/projects/dragonfly/dragonfly/exd/cp_domain_utils.py", line 102, in load_config
    domain_constraints=domain_constraints, domain_info=domain_info, *args, **kwargs)
  File "/home/chris/projects/dragonfly/dragonfly/exd/cp_domain_utils.py", line 262, in load_domain_from_params
    cp_domain = domains.CartesianProductDomain(list_of_domains, domain_info)
  File "/home/chris/projects/dragonfly/dragonfly/exd/domains.py", line 365, in __init__
    self._constraint_eval_set_up()
  File "/home/chris/projects/dragonfly/dragonfly/exd/domains.py", line 377, in _constraint_eval_set_up
    isinstance(self.domain_constraints[idx][1], str) and
KeyError: 0

I assume this should work. The same constraints work when calling with dragonfly-script.

I did a bit of digging and it seems like the constraints are not getting processed when loaded with load_config.
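One observation: the traceback indexes `self.domain_constraints[idx]`, which suggests `load_config` expects the constraints as a *list* of dicts (the form the JSON config is parsed into), not a dict keyed by constraint name. A minimal sketch of that list form, with the constraint written as a plain Python callable; whether `load_config` accepts the callable form in all versions is an assumption based on the constrained ask-tell issue elsewhere in this thread.

```python
import math

def quadrant(x):
    """ Same condition as the JSON config's "np.linalg.norm(x[0:2]) <= 0.5". """
    return math.hypot(x[0], x[1]) <= 0.5

# A list of dicts, mirroring the parsed JSON, instead of a dict keyed by name:
domain_constraints = [{'name': 'quadrant', 'constraint': quadrant}]
config_params_fragment = {'domain_constraints': domain_constraints}
```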

Error encountered while using Dragonfly #5 (Abortive)

See this issue: #33

Another error that aborts execution:

Traceback (most recent call last):
File "in_code_demo_alela2.py", line 37, in <module>
main()
File "in_code_demo_alela2.py", line 32, in main
opt_val, opt_pt, _ = minimise_function(my_func_to_minimize, config.domain, max_capital, config=config) #alela
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/apis/opt.py", line 215, in minimise_function
max_val, opt_pt, history = maximise_function(func_to_max, *args, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/apis/opt.py", line 178, in maximise_function
max_capital, is_mf=False, options=options, reporter=reporter)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gp_bandit.py", line 1000, in gpb_from_func_caller
return optimiser.optimise(max_capital)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/blackbox_optimiser.py", line 189, in optimise
return self.run_experiments(max_capital)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_core.py", line 536, in run_experiments
self._asynchronous_run_experiment_routine()
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_core.py", line 491, in _asynchronous_run_experiment_routine
qinfo = self._determine_next_query()
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gp_bandit.py", line 495, in _determine_next_query
next_eval_point = select_pt_func(self.gp, anc_data)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gpb_acquisitions.py", line 294, in asy_ttei
return _ttei(gp_eval, anc_data, ei_argmax)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gpb_acquisitions.py", line 271, in _ttei
ref_mean, ref_std = gp_eval([ref_point])
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gpb_acquisitions.py", line 55, in <lambda>
return lambda x: _gp.eval(x, uncert_form=_uncert_form)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/gp/gp_core.py", line 170, in eval
K_tetr = self.kernel(X_test, self.X)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/gp/kernel.py", line 74, in __call__
return self.evaluate(X1, X2)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/gp/kernel.py", line 83, in evaluate
return self._child_evaluate(X1, X2)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/gp/kernel.py", line 530, in _child_evaluate
curr_X1 = get_idxs_from_list_of_lists(X1, idx)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/utils/general_utils.py", line 45, in get_idxs_from_list_of_lists
return [elem[idx] for elem in list_of_lists]
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/utils/general_utils.py", line 45, in <listcomp>
return [elem[idx] for elem in list_of_lists]
TypeError: 'NoneType' object is not subscriptable

parallelisation

In the case of using num-workers to introduce parallelisation in the GP-UCB process, is it possible to implement the Gaussian Process Adaptive Upper Confidence Bound, which exploits parallelism in an adaptive manner?

Errors encountered when using Dragonfly #2 (Abortive)

This is the second type of abortive error I found (AssertionError):

/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/gp/gp_core.py:184: RuntimeWarning: invalid value encountered in sqrt
uncert = np.sqrt(np.diag(post_covar))
Traceback (most recent call last):
File "in_code_demo_alela2.py", line 36, in <module>
main()
File "in_code_demo_alela2.py", line 31, in main
opt_val, opt_pt, _ = minimise_function(my_func_to_minimize, config.domain, max_capital, config=config) #alela
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/apis/opt.py", line 215, in minimise_function
max_val, opt_pt, history = maximise_function(func_to_max, *args, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/apis/opt.py", line 178, in maximise_function
max_capital, is_mf=False, options=options, reporter=reporter)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gp_bandit.py", line 1000, in gpb_from_func_caller
return optimiser.optimise(max_capital)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/blackbox_optimiser.py", line 189, in optimise
return self.run_experiments(max_capital)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_core.py", line 536, in run_experiments
self._asynchronous_run_experiment_routine()
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_core.py", line 494, in _asynchronous_run_experiment_routine
self._dispatch_single_experiment_to_worker_manager(qinfo)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_core.py", line 404, in _dispatch_single_experiment_to_worker_manager
self.worker_manager.dispatch_single_experiment(self.experiment_caller, qinfo)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/worker_manager.py", line 189, in dispatch_single_experiment
qinfo = self._dispatch_experiment(func_caller, qinfo, worker_id, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/worker_manager.py", line 173, in _dispatch_experiment
qinfo = func_caller.eval_from_qinfo(qinfo, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/experiment_caller.py", line 221, in eval_from_qinfo
_, qinfo = self.eval_single(qinfo.point, qinfo, *args, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/experiment_caller.py", line 166, in eval_single
true_val = self._get_true_val_from_experiment_at_point(point)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/experiment_caller.py", line 146, in _get_true_val_from_experiment_at_point
assert self.domain.is_a_member(point)
AssertionError

Errors encountered when using Dragonfly #1 (Abortive)

I installed the library successfully and tried it several times.
The library went through the initialisation process perfectly well in all trials.
After that, it never completed the process (capital 200) and exited with this error: OverflowError: int too large to convert to float.

There were two kinds of errors: those interrupting the execution and those where execution continued. I will begin by showing one of the fatal ones. The process ran two more loops before getting this, and I don't know what it could mean. Any idea?

stable_cholesky failed with diag_noise_power=-8.
stable_cholesky failed with diag_noise_power=-7.
stable_cholesky failed with diag_noise_power=-6.
stable_cholesky failed with diag_noise_power=-5.
stable_cholesky failed with diag_noise_power=-4.
stable_cholesky failed with diag_noise_power=-3.
stable_cholesky failed with diag_noise_power=-2.
stable_cholesky failed with diag_noise_power=-1.
stable_cholesky failed with diag_noise_power=0.
stable_cholesky failed with diag_noise_power=1.
stable_cholesky failed with diag_noise_power=2.
stable_cholesky failed with diag_noise_power=3.
stable_cholesky failed with diag_noise_power=4.
**************** Cholesky failed: Added diag noise = -8.846730e-10
stable_cholesky failed with diag_noise_power=5.
**************** Cholesky failed: Added diag noise = -8.846730e-10
stable_cholesky failed with diag_noise_power=6.
**************** Cholesky failed: Added diag noise = -8.846730e-10
stable_cholesky failed with diag_noise_power=7.
...
...
...
... edited for readability; it just repeats the lines above with increasing diag_noise_power values.

...
...
...
stable_cholesky failed with diag_noise_power=306.
**************** Cholesky failed: Added diag noise = -8.846730e-10
/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/utils/general_utils.py:189: RuntimeWarning: overflow encountered in multiply
((10**diag_noise_power) * max_M) * np.eye(M.shape[0]))
stable_cholesky failed with diag_noise_power=307.
**************** Cholesky failed: Added diag noise = -8.846730e-10
stable_cholesky failed with diag_noise_power=308.
**************** Cholesky failed: Added diag noise = -8.846730e-10
Traceback (most recent call last):
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/utils/general_utils.py", line 177, in stable_cholesky
L = np.linalg.cholesky(M)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/numpy/linalg/linalg.py", line 733, in cholesky
r = gufunc(a, signature=signature, extobj=extobj)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/numpy/linalg/linalg.py", line 92, in _raise_linalgerror_nonposdef
raise LinAlgError("Matrix is not positive definite")
numpy.linalg.linalg.LinAlgError: Matrix is not positive definite

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "in_code_demo_alela2.py", line 36, in <module>
main()
File "in_code_demo_alela2.py", line 31, in main
opt_val, opt_pt, _ = minimise_function(my_func_to_minimize, config.domain, max_capital, config=config) #alela
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/apis/opt.py", line 215, in minimise_function
max_val, opt_pt, history = maximise_function(func_to_max, *args, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/apis/opt.py", line 178, in maximise_function
max_capital, is_mf=False, options=options, reporter=reporter)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gp_bandit.py", line 1000, in gpb_from_func_caller
return optimiser.optimise(max_capital)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/blackbox_optimiser.py", line 189, in optimise
return self.run_experiments(max_capital)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_core.py", line 536, in run_experiments
self._asynchronous_run_experiment_routine()
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_core.py", line 491, in _asynchronous_run_experiment_routine
qinfo = self._determine_next_query()
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gp_bandit.py", line 495, in _determine_next_query
next_eval_point = select_pt_func(self.gp, anc_data)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gpb_acquisitions.py", line 127, in asy_ts
return maximise_acquisition(gp_sample, anc_data, vectorised=True)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gpb_acquisitions.py", line 39, in maximise_acquisition
anc_data.max_evals, *args, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_utils.py", line 172, in maximise_with_method
return_history, *args, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_utils.py", line 272, in maximise_with_method_on_cp_domain
return_history)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_utils.py", line 248, in _rand_maximise_vectorised_objective_in_cp_domain
rand_values = [obj(x) for x in rand_samples]
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/exd/exd_utils.py", line 248, in <listcomp>
rand_values = [obj(x) for x in rand_samples]
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gpb_acquisitions.py", line 37, in <lambda>
acquisition = lambda x: acq_fn([x])
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/opt/gpb_acquisitions.py", line 78, in <lambda>
return lambda x: _gp.draw_samples(1, x).ravel()
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/gp/gp_core.py", line 251, in draw_samples
return draw_gaussian_samples(num_samples, mean_vals, covar)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/utils/general_utils.py", line 222, in draw_gaussian_samples
L = stable_cholesky(K)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/utils/general_utils.py", line 189, in stable_cholesky
((10**diag_noise_power) * max_M) * np.eye(M.shape[0]))
OverflowError: int too large to convert to float
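For context, `stable_cholesky` retries the factorisation with progressively larger diagonal jitter, and the `OverflowError` arises once `10**diag_noise_power` exceeds the float range. Below is a pure-Python sketch of that jitter idea; the starting power, step, and stopping rule here are assumptions for illustration, not Dragonfly's exact values.

```python
import math

def cholesky(M):
    """ Plain Cholesky factorisation of a list-of-lists matrix M.
        Returns lower-triangular L with L L^T = M; raises ValueError
        if M is not positive definite. """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = M[i][i] - s
                if d <= 0.0:
                    raise ValueError('Matrix is not positive definite')
                L[i][i] = math.sqrt(d)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L

def stable_cholesky(M, start_power=-11, max_power=10):
    """ Retry Cholesky, adding (10**p * max|M|) to the diagonal with
        increasing p until the factorisation succeeds -- the retry
        pattern behind the log messages above. """
    try:
        return cholesky(M)
    except ValueError:
        pass
    max_M = max(abs(v) for row in M for v in row)
    for p in range(start_power, max_power + 1):
        jitter = (10.0 ** p) * max_M   # overflows for very large p, as in the log
        M_j = [[M[i][j] + (jitter if i == j else 0.0)
                for j in range(len(M))] for i in range(len(M))]
        try:
            return cholesky(M_j)
        except ValueError:
            continue
    raise ValueError('stable_cholesky failed at all noise powers')

# A singular (PSD but not PD) matrix: plain Cholesky fails, a tiny jitter fixes it.
L = stable_cholesky([[1.0, 1.0], [1.0, 1.0]])
```

In the log above the retry loop never succeeds even at enormous powers, and the reported "Added diag noise" is negative, which suggests the kernel matrix itself is malformed rather than merely ill-conditioned.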

List evaluation for multi-objective optimization

I have a case study where I have input data and a list of possible choices to evaluate/prioritize. For single-objective optimization, I can create a surrogate model and then evaluate the list of candidate points with an acquisition function.

I am not sure what to do for multi-objective optimization. Which "score" do I have to compute to evaluate the candidate points against multiple objectives?
Is there an example with Dragonfly? I could not find one.

Bandit Problem

Hi, is there any example I can refer to for solving the multi-armed bandit problem? Thanks for the help.

Save and load progress

Hi,
thanks for the great library!
I was wondering if it is possible to save the state of minimise_function and restore it later.
I wrote some dragonfly scripts that run for a couple of days and it is very inconvenient when they get interrupted without any backup.
Thanks,
Ondrej
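A related mechanism appears in another issue in this thread: prior evaluations can be loaded through `options.progress_load_from`, which points at a pickled dict of points and values. Below is a sketch of writing and reading such a file; the keys follow that issue, but whether a matching save-side option exists in every version is an assumption.

```python
import os
import pickle
import tempfile

# Progress-file format used with options.progress_load_from elsewhere in this
# thread: a pickled dict of evaluated points and their (true) values.
train = {'points': [[[0, 0]], [[0, 1]], [[1, 0]]],
         'vals': [[0, 0], [0, 1], [1, 0]],
         'true_vals': [[0, 0], [0, 1], [1, 0]]}

path = os.path.join(tempfile.mkdtemp(), 'progress.p')
with open(path, 'wb') as f:
    pickle.dump(train, f)

# Restoring the file recovers the evaluation history exactly.
with open(path, 'rb') as f:
    restored = pickle.load(f)
```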

Advice on Multiple Objective Optimization (MOO)

Hi -

Thank you for this library. I am looking forward to implementing it.
I was recommended to use this library for MOO. Initially, I had a crude way of doing MOO where I was basically doing Bayesian optimization on a single objective f = (alpha) f1 - (beta) f2, so that I could maximize one objective and minimize the other with some weighting parameters (alpha and beta). I was told about Pareto fronts and that you can use these to build a better MOO algorithm. Do you have any suggestions on how I can effectively use Dragonfly to take the Pareto front into consideration in my Bayesian optimization algorithm?

Thanks,
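The idea behind moving past the weighted sum f = (alpha) f1 - (beta) f2 is to keep every point that no other point beats in all objectives at once: the Pareto front. A minimal, library-free sketch of that filter (maximising both objectives):

```python
def pareto_front(points):
    """ Return the non-dominated points: p survives unless some other
        point q is at least as good in every objective (and differs). """
    front = []
    for p in points:
        dominated = any(
            q != p and all(qi >= pi for qi, pi in zip(q, p))
            for q in points)
        if not dominated:
            front.append(p)
    return front

# (0, 0) is dominated by both (0, 1) and (1, 0); the others trade off.
front = pareto_front([(0, 0), (0, 1), (1, 0), (0.5, 0.5)])
```

Dragonfly's `multiobjective_maximise_functions` (used in the multi-objective issue later in this thread) returns such a front as `pareto_values`/`pareto_points`, so no manual weighting of the objectives is needed.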

constrained, multi-objective optimization in ask-tell mode

Hello,

I am looking to perform an optimization over a domain with constraints on some of the parameters, but I have had no luck getting the optimizer to obey the constraints in ask-tell mode (it works perfectly with the minimise_function interface). I was wondering if there is anything I am missing, or if this is even supported yet. Below is a code snippet I am currently working with.

# set dragonfly parameter space
domain_vars = [
    {'name': 'x0', 'type': 'float', 'min': 0, 'max': 1},
    {'name': 'x1', 'type': 'float', 'min': 0, 'max': 1},
]

domain_constraints = [
    {'name': 'dc1', 'constraint': constraint}
]

config_params = {'domain': domain_vars, 'domain_constraints': domain_constraints}
config = load_config(config_params)

domain , domain_orderings = config.domain, config.domain_orderings

for num_repeat in range(missing_repeats):

    func_caller = EuclideanFunctionCaller(None, domain)

    opt = gp_bandit.EuclideanGPBandit(func_caller, ask_tell_mode=True)
    opt.initialise()

    # optimize
    xs = []
    ys = []

    for _ in range(budget):
      x = opt.ask()
      print('X : ', x)
      y = surface(x) 
      opt.tell([(x, y)])
      print('x: %s, y: %s'%(x, y))

The above code unfortunately violates the constraint given below:

def constraint(x):
    """ Evaluates the constraint """
    x0 = x[0]
    x1 = x[1]

    y0 = (x0-0.12389382)**2 + (x1-0.81833333)**2
    y1 = (x0-0.961652)**2 + (x1-0.165)**2

    if y0 < 0.2**2 or y1 < 0.35**2:
        return False
    else:
        return True

I'm also curious about multi-objective optimization in ask-tell mode. Say I wanted to use my own scalarizing function, which converts the objective function measurements into a scalar-valued merit after each iteration according to some pre-defined tolerances set by the user. This function effectively adjusts the merits for the entire optimization history according to new observations. To use it, I picture using Dragonfly in ask-tell mode, but instead of appending one input-output pair at each iteration with opt.tell([(x, y)]), I would instead need to update the entire history at each iteration. Is there any way this could be done?

Thank you!
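On the re-scalarisation question, one way to picture it is to store raw objective tuples yourself and recompute every merit from scratch whenever the tolerances change, rather than feeding the optimiser incremental merits. The Chebyshev-style rule below is only a stand-in for the user's own merit function, which is not specified here.

```python
def chebyshev_merit(objectives, tolerances):
    """ Scalar merit: the worst objective relative to its tolerance.
        (A stand-in rule; any scalariser with this signature works.) """
    return min(o / t for o, t in zip(objectives, tolerances))

# Keep raw objective tuples in the history, then recompute all merits at once
# when the tolerances change, instead of telling incremental (x, merit) pairs.
history = [((0.1, 0.2), (1.0, 4.0)), ((0.3, 0.4), (2.0, 1.0))]
tolerances = (1.0, 2.0)
merits = [(x, chebyshev_merit(y, tolerances)) for x, y in history]
```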

Support for no optimization function case?

Hi,

When I traced apis/opt.py, I found that there must exist an optimization function for function_caller.

What I would like to do is:

0. get some initial samples, say X_init and Y_init
1. fit a GP on X_init and Y_init
2. construct acquisition functions
3. optimize the acquisition functions and get the next recommended sample
4. do a REAL experiment on the recommended X to get the REAL Y, then add (X, Y_REAL) to the sample pool (that's why I said there is no optimization function in my case: I need to run a REAL experiment to get the REAL Y)

How do I apply dragonfly to my situation? Any comments are highly appreciated.
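This workflow matches the ask-tell pattern: the optimiser holds no objective function at all, and the measured value is supplied from outside. A stand-in random-search optimiser is used below so the sketch runs without Dragonfly; in Dragonfly itself, another issue in this thread builds the equivalent loop with `EuclideanGPBandit(func_caller, ask_tell_mode=True)`.

```python
import random

class RandomAskTellOptimiser:
    """ Minimal stand-in for an ask-tell optimiser: ask() proposes a point,
        tell() records the value measured in the REAL experiment. """
    def __init__(self, bounds, seed=0):
        self.bounds = bounds
        self.rng = random.Random(seed)
        self.history = []

    def ask(self):
        return [self.rng.uniform(lo, hi) for lo, hi in self.bounds]

    def tell(self, pairs):
        self.history.extend(pairs)

    def best(self):
        return max(self.history, key=lambda xy: xy[1])

opt = RandomAskTellOptimiser([[-10.0, 10.0]])
for _ in range(50):
    x = opt.ask()
    # In the real workflow, y comes from the REAL experiment at x;
    # here a synthetic function (maximised, as Dragonfly does internally).
    y = -(x[0] ** 4 - x[0] ** 2 + 0.1 * x[0])
    opt.tell([(x, y)])
best_x, best_y = opt.best()
```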

load_config does not exist.

I've tried to run in_code_demo.py in the supernova folder and found that the load_config function is not available. Will it be added?

Ask-tell mode with `n_points > 0` loops forever

It got stuck in the `while n_points > 0` loop of blackbox_optimiser.py:

  # Methods for ask-tell interface  
  def ask(self, n_points=None):
    """Get recommended point as part of the ask interface.
    Wrapper for _determine_next_query.
    """
    if n_points:
      points = []
      while self.first_qinfos and n_points > 0:
        qinfo = self.first_qinfos.pop(0)
        if self.is_an_mf_method():
          if self.domain.get_type() == 'euclidean':
            points.append(self.func_caller.get_raw_fidel_domain_coords(qinfo.fidel, qinfo.point))
          else:
            points.append((self.func_caller.get_raw_fidel_from_processed(qinfo.fidel), 
              self.func_caller.get_raw_domain_point_from_processed(qinfo.point)))
        else:
          if self.domain.get_type() == 'euclidean':
            points.append(self.func_caller.get_raw_domain_coords(qinfo.point))
          else:
            points.append(self.func_caller.get_raw_domain_point_from_processed(qinfo.point))
        n_points -= 1
      new_points = []
      while n_points > 0:
        new_points.append(self._determine_next_query())
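The quoted code explains the hang: once `first_qinfos` is exhausted, the second `while n_points > 0` loop never decrements `n_points`. A self-contained sketch of the same two-loop structure with the decrement added (stand-ins replace the qinfo machinery, so this is an illustration of the fix, not a patch):

```python
def ask_n(first_points, determine_next_query, n_points):
    """ Drain pre-seeded points first, then generate the rest --
        decrementing n_points in BOTH loops so the second terminates. """
    points = []
    queue = list(first_points)
    while queue and n_points > 0:
        points.append(queue.pop(0))
        n_points -= 1
    while n_points > 0:
        points.append(determine_next_query())
        n_points -= 1          # missing in the quoted code, hence the hang
    return points

batch = ask_n(['a', 'b'], lambda: 'q', 5)
```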

Travis builds failing with Python2 on OSX

See error below.

$ pyenv install $PYTHON
python-build: use openssl 1.1 from homebrew
python-build: use readline from homebrew
Downloading Python-2.7.12.tar.xz...
-> https://www.python.org/ftp/python/2.7.12/Python-2.7.12.tar.xz
Installing Python-2.7.12...
python-build: use readline from homebrew
ERROR: The Python ssl extension was not compiled. Missing the OpenSSL lib?
Please consult to the Wiki page to fix the problem.
https://github.com/pyenv/pyenv/wiki/Common-build-problems
BUILD FAILED (OS X 10.13.3 using python-build 20180424)
Inspect or clean up the working tree at /var/folders/nz/vv4_9tw56nv9k3tkvyszvwg80000gn/T/python-build.20190407210012.86694
Results logged to /var/folders/nz/vv4_9tw56nv9k3tkvyszvwg80000gn/T/python-build.20190407210012.86694.log
Last 10 log lines:
rm -f /Users/travis/.pyenv/versions/2.7.12/share/man/man1/python.1
(cd /Users/travis/.pyenv/versions/2.7.12/share/man/man1; ln -s python2.1 python.1)
if test "xno" != "xno"  ; then \
		case no in \
			upgrade) ensurepip="--upgrade" ;; \
			install|*) ensurepip="" ;; \
		esac; \
		 ./python.exe -E -m ensurepip \
			$ensurepip --root=/ ; \
	fi
The command "pyenv install $PYTHON" failed and exited with 1 during .

Batch experiments (multiple workers) within ask-tell interface

Hi, thanks for a great BO framework.

I have a question similar to Issue #52 in the sense that I want to do a real experiment on points suggested by dragonfly BO. I can see that the new ask-tell interface allows us to do that (following this demo).

My question is how to use this ask-tell interface in a situation with multiple workers. That is, could you please suggest settings to use with the optimizers so that the line

x = opt.ask()

would return multiple (say 10) points to test at once? Then I would come back with 10 values of y to feed into opt.tell([(x, y)]) and proceed to the next iteration?

Thank you!

Could not import fortran direct library

I installed with:

pip.exe install git+https://github.com/dragonfly/dragonfly.git

It appeared to install successfully:

Successfully installed dragonfly-0.0.0 future-0.17.1

When I try to run the example, I get some import warnings, and the example function does not appear to produce the correct result.

>>> from dragonfly import minimise_function
Could not import Python optimal transport library. May not be required for your application.
Could not import fortran direct library.
>>> min_val, min_pt, history = minimise_function(lambda x: x ** 4 - x**2 + 0.1 * x, [[-10, 10]], 10);
Hyper-parameters for Algorithm -------------------------------------------------
  acq                              default
  acq_opt_max_evals                -1
  acq_opt_method                   default
  acq_probs                        adaptive
  add_group_size_criterion         sampled
  add_grouping_criterion           randomised_ml
  add_max_group_size               6
  build_new_model_every            17
  capital_type                     return_value
  esp_kernel_type                  se
  esp_matern_nu                    -1.0
  esp_order                        -1
  euc_init_method                  latin_hc
  get_initial_qinfos               None
  gpb_hp_tune_criterion            ml-post_sampling
  gpb_hp_tune_probs                0.3-0.7
  gpb_ml_hp_tune_opt               default
  gpb_post_hp_tune_burn            -1
  gpb_post_hp_tune_method          slice
  gpb_post_hp_tune_offset          25
  handle_non_psd_kernels           guaranteed_psd
  handle_parallel                  halluc
  hp_tune_criterion                ml
  hp_tune_max_evals                -1
  hp_tune_probs                    uniform
  init_capital                     default
  init_capital_frac                None
  init_method                      rand
  kernel_type                      default
  matern_nu                        -1.0
  max_num_steps                    10000000.0
  mean_func_const                  0.0
  mean_func_type                   tune
  mf_strategy                      boca
  ml_hp_tune_opt                   default
  mode                             asy
  next_pt_std_thresh               0.005
  noise_var_label                  0.05
  noise_var_type                   tune
  noise_var_value                  0.1
  num_groups_per_group_size        -1
  num_init_evals                   20
  perturb_thresh                   0.0001
  poly_order                       1
  post_hp_tune_burn                -1
  post_hp_tune_method              slice
  post_hp_tune_offset              25
  prev_evaluations                 None
  rand_exp_sampling_replace        False
  report_results_every             13
  shrink_kernel_with_time          0
  track_every_time_step            0
  use_additive_gp                  False
  use_same_bandwidth               False
  use_same_scalings                False
Capital spent on initialisation: 5.0000(0.5000).
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Anaconda3_64\lib\site-packages\dragonfly\utils\oper_utils.py:127: UserWarning: Attempted to use direct, but fortran library could not be imported. Using PDOO optimiser instead of direct.
  warn(report_str)
asy-bo(ei-ucb-ttei-add_ucb) (011/012) cap=1.100::  best_val=(e-0.000, t-0.000), acqs=[3 1 1 1],
>>> min_val
2.4868995751597324e-15
>>> min_pt
array([  2.48689958e-14])

Python version (Windows!):
Python 3.6.2 |Anaconda, Inc.| (default, Sep 19 2017, 08:03:39) [MSC v.1900 64 bit (AMD64)] on win32

Pip version: 19.0.2

Numpy version: 1.13.1

Passing integers to the function to be minimised

I've defined the domain this way:
def main():
  """ Main function. """
  domain_bounds = [[-5, 10], [0, 15]]
  max_capital = 100
(as in branin in_code_demo.py)

My objective function needs some integers as hyperparameters. Can we specify the type of data in domain_bounds?
Some time passed, and I tried this, following the supernova example:

def main():
  """ Main function. """
  #domain_bounds = [[-5, 10], [0, 15]]
  #max_capital = 100
  #domain_bounds = [[20070614,20070614], [8, 64], [4, 24], [8,64], [80,6400], [9,29]]
  max_capital = 200

  domain_vars = [{'name': 'date', 'type': 'int', 'min': 20070614, 'max': 20070614},
                 {'name': 'T', 'type': 'int', 'min': 8, 'max': 64},
                 {'name': 'B', 'type': 'int', 'min': 4, 'max': 24},
                 {'name': 'E', 'type': 'int', 'min': 8, 'max': 64},
                 {'name': 'sigmasquare', 'type': 'int', 'min': 80, 'max': 6400},
                 {'name': 'DeltaT', 'type': 'int', 'min': 9, 'max': 29}]
  config_params = {'domain': domain_vars}
  config = load_config(config_params)
  #opt_val, opt_pt, _ = maximise_function(branin, domain_bounds, max_capital) #alela
  #opt_val, opt_pt, _ = minimise_function(my_func_to_minimize, domain_bounds, max_capital) #alela
  opt_val, opt_pt, _ = minimise_function(my_func_to_minimize, config.domain, max_capital, config=config) #alela
  print('Optimum Value in %d evals: %0.4f'%(max_capital, opt_val))
  print('Optimum Point: %s'%(opt_pt))

if __name__ == '__main__':
  main()

But for some reason it aborts the execution with the error:

File "in_code_demo_alela2.py", line 37, in <module>
main()
File "in_code_demo_alela2.py", line 32, in main
opt_val, opt_pt, _ = minimise_function(my_func_to_minimize, config.domain, max_capital) #alela
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/apis/opt.py", line 215, in minimise_function
max_val, opt_pt, history = maximise_function(func_to_max, *args, **kwargs)
File "/Users/alejandrosantillaniturres/Desktop/programming/python/virtualenv_ataa/lib/python3.7/site-packages/dragonfly/apis/opt.py", line 170, in maximise_function
domain_orderings=config.domain_orderings)
AttributeError: 'NoneType' object has no attribute 'domain_orderings'

Any idea what's going on?
Thank you!

How to make progress.p??

Hello. I have a question about progress.p, because I want to use my own data before running the algorithm.

I saw progress.p and moo_progress.p mentioned, but these files are not in the repository.
I would be very thankful if you could tell me how to make a pickle file for use with the load-and-save functionality.

Strange behavior in first steps of multi-objective optimization

While playing around with multi-objective optimization, I observed the following behavior, which seems strange to me. To explain it, let me share this simple toy example, where the domain is the discrete grid {0,...,9}^2 and the two objective functions are f1(x_1,x_2) = x_1 and f2(x_1,x_2) = x_2 without noise, which I want to jointly maximize. I initialize the optimization with the three points (0,0), (0,1) and (1,0), and I am interested in the Pareto values during the optimization routine.

import dragonfly
import numpy as np
from argparse import Namespace
import pickle

##### Set parameters
domain = [[i//10, i%10] for i in range(100)]
domain_vars = [{'type': 'discrete_euclidean', 'items': domain}]
config_params = {'domain': domain_vars}
config = dragonfly.load_config(config_params)

##### Specify options
options = Namespace(
            build_new_model_every = 1,
            report_results_every = 1,
            report_model_on_each_build = True, 
            gpb_hp_tune_criterion = 'ml',
            acq = 'ucb',
            dom_euc_kernel_type = 'se',
            noise_var_type = 'value',
            noise_var_value = 0.0,     
            is_multi_objective = 1)

# Create pickle file to later read for initialisation
train = {"points": [[[0,0]],[[0,1]],[[1,0]]], 
        "vals": [[0,0],[0,1],[1,0]], 
        "true_vals": [[0,0],[0,1],[1,0]]}
pickle.dump(train, open('train.p', "wb"))

# Load initialisation data
options.progress_load_from = 'train.p'

objectives = [lambda x: x[0][0], lambda x: x[0][1]]

pareto_values, pareto_points, history = dragonfly.multiobjective_maximise_functions(funcs = objectives,
                                                    domain = config.domain, 
                                                    max_capital = 0, 
                                                    opt_method = 'bo', 
                                                    options = options,
                                                    config = config)

print('\npareto_values: ', pareto_values)
print('pareto_points: ', pareto_points)
print('history.curr_pareto_vals: ', history.curr_pareto_vals)
print('history.curr_pareto_points: ', history.curr_pareto_points)
print('history.query_vals: ', history.query_vals)
print('history.query_points: ', history.query_points)

If max_capital is set to 0, then the output is

Multi-objective Optimisation with mobo(ucb) using capital 0.0 (return_value)
Loaded 3 data from files ['train.p'].
Legend: <iteration_number> (<num_successful_queries>, <fraction_of_capital_spent>):: #Pareto=<num_pareto_optimal_points_found>, acqs=<num_times_each_acquisition_was_used>
#000 (000, nan):: #Pareto: 2, acqs=[ucb:0], 

pareto_values:  [[0, 1], [1, 0]]
pareto_points:  [[[0, 1]], [[1, 0]]]
history.curr_pareto_vals:  []
history.curr_pareto_points:  []
history.query_vals:  []
history.query_points:  []

It seems strange to me that the values of pareto_values and history.curr_pareto_vals differ, and similarly for the Pareto points. Can you explain the difference between these two quantities to me?

Moreover, if max_capital is set to 1, then the output is

Multi-objective Optimisation with mobo(ucb) using capital 1.0 (return_value)
Loaded 3 data from files ['train.p'].
Legend: <iteration_number> (<num_successful_queries>, <fraction_of_capital_spent>):: #Pareto=<num_pareto_optimal_points_found>, acqs=<num_times_each_acquisition_was_used>
    -- GP-0 at iter 0: mu[#0]=0.5037, DomProd scale=0.16, SE: sc:1.0000 bws:[0.12 77.15], noise-var=0.000 (n=3)
    -- GP-1 at iter 0: mu[#0]=0.5037, DomProd scale=0.16, SE: sc:1.0000 bws:[77.15 0.05], noise-var=0.000 (n=3)
/home/maier/.virtualenvs/dragonfly/lib/python3.6/site-packages/dragonfly/gp/gp_core.py:187: RuntimeWarning: invalid value encountered in sqrt
  uncert = np.sqrt(np.diag(post_covar))
    -- GP-0 at iter 1: mu[#0]=0.5037, DomProd scale=0.16, SE: sc:1.0000 bws:[0.12 77.15], noise-var=0.000 (n=3)
    -- GP-1 at iter 1: mu[#0]=0.5037, DomProd scale=0.16, SE: sc:1.0000 bws:[77.15 0.05], noise-var=0.000 (n=3)
#001 (000, 0.000):: #Pareto: 2, acqs=[ucb:0], 
#002 (002, 2.000):: #Pareto: 1, acqs=[ucb:1], 

pareto_values:  [[1.0, 9.0]]
pareto_points:  [[array([1, 9])]]
history.curr_pareto_vals:  [[[1.0, 1.0]], [[1.0, 9.0]]]
history.curr_pareto_points:  [[[array([1, 1])]], [[array([1, 9])]]]
history.query_vals:  [[1.0, 1.0], [1.0, 9.0]]
history.query_points:  [[array([1, 1])], [array([1, 9])]]

What I don't understand here is why there are two query points even though max_capital=1. This can also be seen in the reports during the optimisation, which show two successful queries. I also don't understand how the first query point is chosen. Could you please explain this to me?

Thanks in advance for your help!

"GP" and "GP bandit", what's the difference?

Hi,

I'm just curious about the difference between "GP" and "GP bandit".

To my understanding, "GP bandit" implies that the input domain of the GP is discrete, whereas "GP" covers the situation where the input domain can be either continuous or discrete.

Please correct me if I'm wrong, and sorry for any inconvenience caused by this question.
Any further references would also be highly appreciated.

Best,
JM

Complete documentation for Options and Config

Hi, I'm trying to use Dragonfly for some academic research, and I found the overall interface, especially the ability to specify the domain as a JSON config file, very useful. My first instinct after running the examples was that I could reuse the various CLI/JSON parsing utilities, in conjunction with minor changes to the Python API, to design custom experiments. However, I feel quite limited in my ability to fully use the package due to the lack of complete documentation on the features a config.json and options.json file can contain. I have already perused all the given examples, but the information is quite disjointed, and going through various pieces of code reveals more options and settings that are missing from the example JSON files but can presumably be specified to alter the package's behaviour. I believe it would massively improve the usability of the code if a single source of information for all such values were available, as opposed to having to browse through a large amount of parsing code.

This would also extend to options that are configurable under the "config" and "options" namespaces but are perhaps not accessible via the JSON files, only through the Python API; knowing what they are, their possible values, and their effects would enable custom code to be written around them.

Could you kindly address this? On a related note, I noticed that a number of options are "processed", resulting in subtle changes to them, including domain points and configuration-file values. Again, it would massively improve code reusability if concise documentation about this were available, as opposed to hunting down the individual parsing and processing code for every single such value.

Thank you.

How to define domain ordering?

It seems that to use discrete fidelities I need to define the domain_ordering, but I can't find any documentation on how this ordering should be defined.
