mlr-org / mlrmbo
Toolbox for Bayesian Optimization and Model-Based Optimization in R
Home Page: https://mlrmbo.mlr-org.com
License: Other
We want to include the time aspect in both single- and multi-criteria optimization, e.g. we want things like expected improvement per minute.
One special application in the multi-criteria case: time is one of the target functions and we want to propose multiple points. Here the time of one iteration is the maximum time over the proposed function evals.
We need to think about what we can do here.
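One way to read "expected improvement per minute" is to divide the usual expected improvement by a predicted evaluation time. The following is only a minimal sketch under that assumption: the function names and the idea of a separate time prediction `time.pred` are hypothetical and not part of mlrMBO.

```r
# Expected improvement for minimization, from a surrogate's mean and standard error.
expectedImprovement = function(mu, se, y.best) {
  d = y.best - mu
  z = d / se
  ei = d * pnorm(z) + se * dnorm(z)
  # Without predictive uncertainty the improvement is deterministic.
  ifelse(se > 0, ei, pmax(d, 0))
}

# Hypothetical time-aware criterion: improvement per unit of predicted time.
eiPerTime = function(mu, se, y.best, time.pred) {
  expectedImprovement(mu, se, y.best) / time.pred
}

# For points evaluated in parallel, one iteration costs as much as its slowest eval.
batchTime = function(time.pred) max(time.pred)

eiPerTime(mu = c(0.5, 0.2), se = c(0.1, 0.3), y.best = 0.4, time.pred = c(1, 10))
batchTime(c(1, 5, 3))
```

Dividing by time favours cheap points, so a real implementation would probably need a floor on `time.pred` to avoid degenerate proposals.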
Setting a design by hand requires the user to set the trafo attribute manually (initial design must not be transformed). There must be a better solution. Maybe
Should be done in checkStuff function
Do this later when other stuff is finished
Also think about the interface.
We have to adapt the code for the case of discrete parameters. E.g. after generating the LHS initial design we have to transform the output to the correct discrete values.
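A minimal sketch of such a transformation, assuming the LHS design is available as numbers in [0, 1]: each column that belongs to a discrete parameter is cut into equally sized bins, one per value. `mapToDiscrete` is a hypothetical helper, not an existing mlrMBO function.

```r
# Map numbers from [0, 1] onto a vector of discrete values using equal-width bins.
mapToDiscrete = function(u, values) {
  idx = floor(u * length(values)) + 1
  # u == 1 would fall just outside the last bin, so clamp the index.
  idx = pmin(idx, length(values))
  values[idx]
}

set.seed(1)
u = runif(10)  # stand-in for one column of an LHS design
mapToDiscrete(u, c("a", "b", "c"))
```

Because an LHS column is stratified over [0, 1], the equal-width bins keep the discrete values roughly balanced in the design.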
Since R version 3.0.2 the BSD license is deprecated. This is noted by make check. We should change to BSD_3_clause or BSD_2_clause plus license file as suggested by the R team. BSD 3 seems to be appropriate. Any suggestions?
They currently take quite a lot of time.
But only do this where we can check the same thing in less time!
Hi,
Some typos lead to misleading error messages.
For example, a typo in "infill.opt" produces the following message:
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'lcb' of mode 'function' was not found
Because of this code in ProposePoints:
infill.opt.fun = switch(control$infill.opt,
  cmaes = infillOptCMAES,
  focussearch = infillOptFocus,
  ea = infillOptEA,
  # default: try to match the fun which is given as string
  match.fun(control$multipoint.method))
A typo in infill.crit produces the following message, which is more useful than the previous one but still not clear:
Error in checkArg(infill.opt, "character", len = 1L, na.ok = FALSE) :
object 'infill.opt' not found
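A clearer message could come from validating the option against the known names before dispatching. This is only a sketch: `chooseInfillOpt` and the placeholder optimizers are hypothetical and merely stand in for infillOptCMAES, infillOptFocus and infillOptEA.

```r
# Look up an optimizer by name and fail with a readable message on typos.
chooseInfillOpt = function(infill.opt, optimizers) {
  if (!is.character(infill.opt) || length(infill.opt) != 1L || is.na(infill.opt))
    stop("'infill.opt' must be a single string.")
  if (!infill.opt %in% names(optimizers))
    stop(sprintf("Unknown infill.opt '%s'. Valid values: %s",
      infill.opt, paste(names(optimizers), collapse = ", ")))
  optimizers[[infill.opt]]
}

# Placeholders for the real optimizer functions (assumption for illustration).
optimizers = list(
  cmaes = function(...) "cmaes",
  focussearch = function(...) "focussearch",
  ea = function(...) "ea"
)

chooseInfillOpt("focussearch", optimizers)
msg = tryCatch(chooseInfillOpt("focusearch", optimizers),
  error = function(e) conditionMessage(e))
print(msg)
```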
infill.crit="ei" does not work with every model (at the moment only with a kriging model)
How to do that effectively? This sanity check should be performed in makeMBOControl, but this function knows nothing about the used learner.
mlrMBO currently supports the parallelization of some internal funs with Bernd's parallelMap package. This should be briefly mentioned in the tutorial.
random.points ---> focussearch.points
infill.opt="random" ---> "focussearch"
todo-files/parego.R
74: infill.opt="random", infill.opt.random.points=1000)
shiny/server.R
79: infill.crit="ei", infill.opt="random", infill.opt.random.points=2000)
test_src.R
48: # infill.opt.random.maxit=5, infill.opt.random.points=1000L)
inst/examples/ex_1d_1.R
24: infill.crit="ei", infill.opt="random", infill.opt.random.points=500)
inst/examples/ex_autoplot.R
24:# infill.crit="ei", infill.opt="random", infill.opt.random.points=500)
51:# ctrl = makeMBOControl(init.design.points=20, iters=5, infill.opt.random.points=100, noisy=TRUE)
65: infill.crit="ei", infill.opt="random", infill.opt.random.points=2000)
inst/examples/ex_1d_3.R
34: infill.opt.random.points=100, noisy=TRUE)
inst/examples/ex_2d_1.R
20: infill.crit="ei", infill.opt="random", infill.opt.random.points=2000)
inst/tests/test_misc.R
18:# ctrl = makeMBOControl(minimize=FALSE, infill.crit="mean", iters=30, infill.opt.random.points=100)
70:# opt = mbo(fit, ps, learner = surrogate, control = makeMBOControl(infill.opt.random.points=10))
inst/tests/test_exampleRun.R
10:# infill.opt="random", infill.opt.random.points=10)
inst/tests/test_mbo_impute.R
21: ctrl = makeMBOControl(iters=20, infill.opt.random.points=500)
23: ctrl = makeMBOControl(iters=20, infill.opt.random.points=500, impute=function(x, y, opt.path) 0)
25: ctrl = makeMBOControl(iters=50, infill.opt.random.points=500)
27: ctrl = makeMBOControl(iters=50, infill.opt.random.points=500, impute=function(x, y, opt.path) 0, impute.errors=TRUE)
At the moment, the log output is rounded to 2 digits, for example:
[mbo] 0: cost=14092.26; gamma=0.00; epsilon=0.01 : error=0.762, execTime=2.464
[mbo] 0: cost=0.01; gamma=0.00; epsilon=0.00 : error=0.028, execTime=24374.496
[mbo] 0: cost=0.00; gamma=0.18; epsilon=0.00 : error=0.028, execTime=22879.370
As you can see, this is bad, because many parameters are rounded to 0.00. We want output like 1.44e-4.
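One possible approach is to switch to scientific notation whenever a value is too small or too large for two fixed digits. The sketch below uses only base R; `formatLogValue` and its thresholds are assumptions chosen for illustration.

```r
# Format numbers for log output: fixed notation for "normal" magnitudes,
# scientific notation for very small or very large ones.
formatLogValue = function(x, digits = 3) {
  sci = x != 0 & (abs(x) < 1e-2 | abs(x) >= 1e5)
  ifelse(sci,
    formatC(x, format = "e", digits = digits - 1),
    formatC(x, format = "f", digits = 2))
}

formatLogValue(c(14092.26, 0.000144, 0, 0.18))
```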
We don't want 2 functions for "normal" mbo and parEGO, but one function with a method parameter. This is not crucial yet, but it should be done after some more multicrit methods have been implemented.
This would include: renaming the old mbo function to soMBO (single objective), writing a new function mbo with exactly the same interface that just does some param checks and then calls the corresponding function, and introducing a new method param.
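The proposed structure could look roughly like this. It is only a sketch: `mboDispatch` is a hypothetical name, and `soMBO` and `parEGO` are placeholders that return a string instead of running an optimization.

```r
# Placeholders for the real single-objective and parEGO implementations.
soMBO = function(fun, par.set, ...) "single-objective run"
parEGO = function(fun, par.set, ...) "parEGO run"

# One entry point: check params, then dispatch on the method.
mboDispatch = function(fun, par.set, ..., method = c("so", "parego")) {
  method = match.arg(method)
  if (!is.function(fun))
    stop("'fun' must be a function.")
  switch(method,
    so = soMBO(fun, par.set, ...),
    parego = parEGO(fun, par.set, ...))
}

mboDispatch(function(x) sum(x^2), par.set = NULL)
mboDispatch(function(x) sum(x^2), par.set = NULL, method = "parego")
```

With `match.arg`, a misspelled method fails early with a message that lists the valid choices.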
We need a function to plot the optimization process of exampleRun objects for mixed discrete and numeric functions.
< I WILL UPDATE THIS WITH NEW GOOD IDEAS FROM BELOW BUT ANSWER BELOW! >
We need to discuss how errors are handled in the package. This is important as we otherwise lose most of the info of long optimization runs.
There are errors of multiple kinds:
FE1) Function Eval: Exception occurs. Could catch this.
FE2) Function Eval: Crash that kills the entire R process. Problematic if the eval was done in the same process where mbo runs.
FE3) Function does not return, because it does not terminate or terminates only after 100 years.
MBO1) Exception in our own code happens. Should not, but could.
MBO2) Total crash of our own code.
Options to handle this:
O1) Always store all relevant information from mbo (optpath, learner, control object and so on) on the master in an RDATA file every k iterations. This helps with ALL errors above in the sense that we do not lose PRIOR information. It does not help with the fact that the optimization stops.
Implement this in any case as a user option.
We can even try to code a warm-start / continue function.
O2) Catch FE1 error via "try". Warn on the console about it, log message to opt.path. Impute value of eval. Handles FE1 completely, but nothing else.
O3) Run evals in a separate R process (with walltime). Then basically do the same as in O2. Handles both FE1 and FE2, and FE3 only if we can specify a timeout.
Incurs overhead and is much harder to implement.
We basically have this already for free in parallelMap / BJ mode when we do multipoint evals. Maybe we could be tricky and use this as well for single-point evals.
Should be discussed.
O4) When an error occurs, either FE1 or MBO1, and the user does not want to or cannot impute values, we could return the opt.path somehow, either in global mem or on disk. Maybe this should simply be combined with O1. Simply do a final "store-on-disk" then.
Against MBO1 and 2 we cannot do much except O1 / O4.
Discuss!
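Options O1 and O2 can be combined in a few lines of base R. The following is a minimal sketch with a plain data frame standing in for the opt.path; `runWithCheckpoints` and its arguments are hypothetical, and it assumes at least one point in `xs`.

```r
# Evaluate fun on each x, catch exceptions (O2), impute failed evals,
# and store the progress on disk every save.every iterations (O1).
runWithCheckpoints = function(fun, xs, impute.y, file, save.every = 5L) {
  n = length(xs)
  path = data.frame(x = xs, y = NA_real_, error = NA_character_,
    stringsAsFactors = FALSE)
  for (i in seq_len(n)) {
    res = try(fun(xs[i]), silent = TRUE)
    if (inherits(res, "try-error")) {
      msg = conditionMessage(attr(res, "condition"))
      warning(sprintf("Evaluation %i failed (%s), imputing y = %g", i, msg, impute.y))
      path$y[i] = impute.y
      path$error[i] = msg
    } else {
      path$y[i] = res
    }
    # Checkpoint: only the rows evaluated so far are written.
    if (i %% save.every == 0L || i == n)
      saveRDS(path[seq_len(i), ], file)
  }
  path
}

f = function(x) if (x > 2) stop("boom") else x^2
tmp = tempfile(fileext = ".rds")
res = suppressWarnings(runWithCheckpoints(f, c(1, 2, 3), impute.y = 99, file = tmp, save.every = 2L))
print(res)
```

This covers FE1 only. FE2 and FE3 still need a separate process with a timeout, as described in O3.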
Why is evalTargetFunction() so complex? Can we get rid of "..."? We always get an error there when we want to evaluate the mbo() function step by step.
Regarding the "fun" argument of evalTargetFunction(): "Fitness function to minimize. The first argument has to be a list of values." Here it is unclear whether each parameter has to be a list entry or the whole set, e.g., list(c(22,13)) or list(22,13)?
I would propose to drop the list requirement and switch to a vector. Then the user can define objective functions more easily. For example, with the list requirement it is impossible to define BBOB functions as follows:
objfun1=generate_ackley_function(dimensions=5)
because the first argument is a vector and not a list.
To summarize, at the moment evalTargetFunction() causes most of the error messages when trying to use mbo().
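Until the interface changes, a thin wrapper can bridge the two conventions. This sketch is an assumption about how one might do it, not existing mlrMBO behaviour. `asListFunction` is a hypothetical helper, and `sphere` stands in for a vector-based benchmark function.

```r
# Wrap a function that expects a numeric vector so that it accepts a list of values.
asListFunction = function(f) {
  force(f)
  function(x, ...) f(unlist(x, use.names = FALSE), ...)
}

sphere = function(v) sum(v^2)  # stand-in for a vector-based test function
objfun = asListFunction(sphere)

# Both list layouts from the question give the same result.
objfun(list(22, 13))
objfun(list(c(22, 13)))
```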
because of changes to evalTargetFunction
whole code needs to be reviewed anyway
@param infill.opt [\code{character(1)}]\cr
The name "random" might be a bit confusing, as one might think that the next point is chosen randomly. Would it not be better to name this option "seq.design"?
Is the option "ei" still active (meaningful), as we have now "infill.crit" parameter of makeMBOControl() function?
if we do not have that already.
actually we would want to learn this during the search....
It's called mlrMBO for a reason, right? But neither in the mlr nor in the mlrMBO tutorial is it shown how to optimize an mlr learner using mlrMBO. But in mlr we have something unexported like makeTuneControlMBO.
How to proceed?
I started to restructure the tutorial. Going to work on it this week.
I think we talked about this a while ago: At the moment, the control object and its constructor are ugly, huge things and they are going to grow even more. I'm working on the new feature for parEGO we discussed last Friday, which will add another parameter. And it won't stop growing. I doubt anyone except us can keep track of this mass of parameters. Most new features we implement add one or more new parameters, and we dump everything into this one function / object.
What we should do (not now, but in the near future) is restructure this a bit. We don't want to change our internal usage of the control object - we just want to have a better user interface.
We could split the function into useful parts, like one part for infill.crit options, one part for multipoint proposal, one part for multicrit, etc., since some parameters will never be set at the same time. E.g. the multipoint and the parEGO params can't be set at the same time.
And in the end there would be the makeControlObject function to set some main params and to fuse them with the specialized control objects.
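A rough sketch of what such a split interface could look like. All function names, parameters and defaults below are hypothetical and only illustrate the idea of one main constructor plus specialized setters that reject conflicting combinations.

```r
# Main constructor: only the parameters every run needs.
makeControlSketch = function(iters = 10L, init.design.points = 20L) {
  list(iters = iters, init.design.points = init.design.points)
}

# Specialized setters add one group of options each.
setInfillSketch = function(control, crit = c("mean", "ei", "lcb"), opt = "focussearch") {
  control$infill.crit = match.arg(crit)
  control$infill.opt = opt
  control
}

setMultiPointSketch = function(control, method, n.points) {
  if (!is.null(control$parego))
    stop("Multipoint and parEGO settings cannot be combined.")
  control$multipoint = list(method = method, n.points = n.points)
  control
}

setParEGOSketch = function(control, rho = 0.05) {
  if (!is.null(control$multipoint))
    stop("Multipoint and parEGO settings cannot be combined.")
  control$parego = list(rho = rho)
  control
}

ctrl = makeControlSketch(iters = 5L)
ctrl = setInfillSketch(ctrl, crit = "ei")
ctrl = setMultiPointSketch(ctrl, method = "some.method", n.points = 4L)
str(ctrl)
```

The internal code would still see a single control list, so only the user-facing constructors change.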
must be removed, mlr code must be used.
think about pascal remarks with 2*max
does all work?
This includes 2 things
iterations in sequence
points in multipoint
we need to at least warn, but maybe repair this. it is unclear whether this actually happens at all.
Failure: mbo works correctly with and without initial design
mbo(f, ps, des, learner, ctrl) code did not generate an error
in particular it is the test PROVIDE INITIAL DESIGN WITH TRAFO
Hi, the following setting causes errors:
library(mlrMBO)
library(soobench)
objfun=generate_branin_function()
ps = makeNumericParamSet(len = number_of_parameters(objfun), lower = lower_bounds(objfun), upper = upper_bounds(objfun))
design.x=generateDesign(30, ps)
y=apply(design.x,1,objfun)
design=cbind(design.x,y)
attr(design, "trafo")=FALSE
learner_km=makeLearner("regr.km", predict.type="se", covtype="matern3_2",nugget.estim=TRUE)
ctrl = makeMBOControl(
  iters = 50,
  infill.crit="ei",
  init.design.points=10,
  infill.opt="focussearch")
m=mbo(makeMBOFunction(objfun), design=design, par.set=ps, learner=learner_km, control=ctrl, show.info=TRUE)
I found that the problem lies in the function generateMBODesign, line 57:
if (all(y.name %in% colnames(design.x)))
Since design.x cannot contain the y names (see line 40: design.x = dropNamed(design, y.name)),
I changed lines 57-58 as follows:
if (all(y.name %in% colnames(design))) {
design.y = data.frame(design[, y.name])
names(design.y)=y.name
And the lines 70-71 as follows:
ys = convertRowsToList(as.vector(design.y))
Map(function(x,y) addOptPathEl(opt.path, x=x, y=unlist(y), dob=0), xs, ys)
With these changes we no longer get the error message.
If you find the changes ok, I will commit them.
The following produces an error.
library(mlrMBO)
objfun = function(x) 1
par.set = makeParamSet(
  makeNumericParam("x", lower=0,upper=1),
  makeIntegerParam("k", lower=1, upper=2)
)
control = makeMBOControl(
  iters = 1,
  init.design.args=list(k=3, dup=4)
)
learner_rf = makeLearner("regr.randomForest")
mbo(objfun, par.set, control=control, learner = learner_rf)
Error in (function (n, k, dup = 1) :
formal argument "k" matched by multiple actual arguments
8: (function (n, k, dup = 1)
{
if (length(n) != 1 | length(k) != 1 | length(dup) != 1)
stop("n, k, and dup may not be vectors")
if (any(is.na(c(n, k, dup))))
stop("n, k, and dup may not be NA or NaN")
if (any(is.infinite(c(n, k, dup))))
stop("n, k, and dup may not be infinite")
if (n != floor(n) | n < 1)
stop("n must be a positive integer\n")
if (k != floor(k) | k < 1)
stop("k must be a positive integer\n")
if (dup != floor(dup) | dup < 1)
stop("The dup factor must be a positive integer\n")
result <- numeric(k * n)
result2 <- .C("maximinLHS_C", as.integer(n), as.integer(k),
as.integer(dup), as.integer(result))[[4]]
eps <- runif(n * k)
result2 <- (result2 - 1 + eps)/n
return(matrix(result2, nrow = n, ncol = k, byrow = TRUE))
})(n = 20L, k = 2L, k = 3, dup = 4)
7: do.call(fun, c(list(n = n, k = k), fun.args))
6: generateDesign(control$init.design.points, par.set, control$init.design.fun,
control$init.design.args, trafo = FALSE)
5: mbo(objfun, par.set, control = control, learner = learner_rf) at bla.R#21
4: eval(expr, envir, enclos)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("C:/Users/Karin/Desktop/bla.R")
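The traceback shows that `k` is passed twice to the design function: once internally and once through init.design.args. A check before the `do.call` could turn this into a readable error. The sketch below is an assumption about a possible fix, with `callDesignFun` and `toyDesign` as hypothetical stand-ins for the internal call and for the LHS function.

```r
# Reject user arguments that collide with the arguments set internally.
callDesignFun = function(fun, n, k, fun.args = list()) {
  reserved = intersect(names(fun.args), c("n", "k"))
  if (length(reserved) > 0L)
    stop(sprintf("init.design.args must not contain %s, these are set internally.",
      paste(sprintf("'%s'", reserved), collapse = ", ")))
  do.call(fun, c(list(n = n, k = k), fun.args))
}

# Stand-in with the same signature as the design function in the traceback.
toyDesign = function(n, k, dup = 1) matrix(runif(n * k), nrow = n, ncol = k)

dim(callDesignFun(toyDesign, n = 20L, k = 2L, fun.args = list(dup = 4)))
msg = tryCatch(callDesignFun(toyDesign, n = 20L, k = 2L, fun.args = list(k = 3, dup = 4)),
  error = function(e) conditionMessage(e))
print(msg)
```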
Same for multipoint proposal
The following produces an error I don't understand:
library(mlrMBO)
set.seed(1)
objfun = function(x, ...) rnorm(1)
ps = makeParamSet(
  makeNumericParam("sstep", lower=0.8, upper=1),
  makeNumericParam("distanz", lower=0.5, upper=0.8)
)
lrn = makeLearner("regr.km", predict.type="se", nugget.estim=TRUE)
ctrl = makeMBOControl(
  init.design.points = 8,
  iters = 1,
  infill.crit = "lcb",
  infill.opt = "cmaes"
)
res = mbo(objfun, ps, learner=lrn, control=ctrl)
Computing y column for design. Was not provided
[mbo] 0: sstep=0.94; distanz=0.52 : y=-0.059
[mbo] 0: sstep=0.92; distanz=0.62 : y= 1.100
[mbo] 0: sstep=0.84; distanz=0.66 : y= 0.763
[mbo] 0: sstep=0.89; distanz=0.73 : y=-0.165
[mbo] 0: sstep=0.96; distanz=0.57 : y=-0.253
[mbo] 0: sstep=0.80; distanz=0.80 : y= 0.697
[mbo] 0: sstep=0.98; distanz=0.72 : y= 0.557
[mbo] 0: sstep=0.86; distanz=0.59 : y=-0.689
Error in t.default(results[[j]]$par) : argument is not a matrix
> traceback()
10: t.default(results[[j]]$par)
9: t(results[[j]]$par)
8: as.data.frame(t(results[[j]]$par))
7: infill.opt.fun(infill.crit.fun, model, control, par.set, opt.path,
design)
6: proposePoints(model, par.set, control, opt.path)
5: mbo(objfun, ps, learner = lrn, control = ctrl) at MBOTest2.R#21
4: eval(expr, envir, enclos)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("MBOTest2.R")
Example with kknn:
library(mlrMBO)
fun = function(x)
  sin(x$num2) + ifelse(x$disc1 == "a", sin(x$num1), 0)
ps = makeParamSet(
  makeDiscreteParam("disc1", values = c("a", "b")),
  makeNumericParam("num1", lower = 0, upper = 1,
    requires = quote(disc1 == "a")),
  makeNumericParam("num2", lower = 0, upper = 1)
)
res = mbo(fun, ps,
  learner = makeBaggingWrapper(makeLearner("regr.kknn"), 10L, predict.type = "se"),
  control = makeMBOControl(init.design.points = 20,
    iters = 10,
    infill.crit = "ei"))
I can think of 3 possible solutions:
At the moment I have a problem that my objective function
Here is a very simple and foolish example:
objfun=function(listOfValues,akt.best.y)
{
  x1=listOfValues$x1
  x2=listOfValues$x2
  if((x1+x2)<akt.best.y) {S="ok"} else {S="not ok"}
  y=x1*3+5*x2-2
  return(list(y=y,S=S))
}
Regarding the first point:
I will have to adapt the mbo function for my example. I tried to do the following in the mbo loop, but it does not work (of course):
akt.best= max(getOptPathY(opt.path, y.name, drop = TRUE))
evals = evalTargetFun(fun, par.set, xs, opt.path, control, show.info, oldopts, akt.best, ...)
Is it possible to implement my problem without changing the evalTargetFun?
A lot of learners will fail to predict with discrete params because of dropped factor levels. Here are some examples:
library(mlrMBO)
ps = makeParamSet(makeNumericVectorParam("x", len = 5, lower = 0, upper = 1),
  makeDiscreteParam("z", values = 1:10))
f = function(x) sum(x$x) + as.numeric(x$z)
mbo(f, ps,
  learner = makeBaggingWrapper(makeLearner("regr.kknn"), 10L, predict.type = "se"),
  control = makeMBOControl(init.design.points = 20,
    iters = 10))
mbo(f, ps,
  learner = makeBaggingWrapper(makeLearner("regr.lm"), 10L, predict.type = "se"),
  control = makeMBOControl(init.design.points = 20,
    iters = 10))
mbo(f, ps,
  learner = makeBaggingWrapper(makeLearner("regr.blackboost"), 10L, predict.type = "se"),
  control = makeMBOControl(init.design.points = 20,
    iters = 10))
mbo(f, ps,
  learner = makeBaggingWrapper(makeLearner("regr.nnet"), 10L, predict.type = "se"),
  control = makeMBOControl(init.design.points = 20,
    iters = 10))
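The underlying effect can be reproduced in base R without any learner. The sketch below shows how a factor keeps its full level set when the levels are given explicitly, and how they get lost once unused levels are dropped. Whether this is exactly where mlrMBO or the wrapped learners lose the levels is an assumption.

```r
values = as.character(1:10)  # all values of the discrete param

# A small design in which only some of the values occur.
design = data.frame(z = c("1", "3", "3", "7"), stringsAsFactors = FALSE)

# Declaring the full level set keeps unobserved values known to the model.
design$z = factor(design$z, levels = values)
nlevels(design$z)

# Dropping unused levels (e.g. after subsampling) loses them again.
nlevels(droplevels(design$z))

# New points for prediction need the same level set as the training data.
newdata = data.frame(z = factor("5", levels = values))
identical(levels(newdata$z), levels(design$z))
```

A model fitted on the version with dropped levels has never seen level "5", which is the situation in which many learners refuse to predict.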
Failure: infill crits
or$y < 1 isn't true
We need to thoroughly test dependent params in unit tests and the impute wrapper.
Currently mlrMBO development is based on the "old" way with copy & paste of helpful development tools.
Hi,
where can I find the function estimateResidualVariance? It is called in infillCritAEI.