paradox's Introduction

paradox

Package website: release | dev

Universal Parameter Space Description and Tools.

Installation

remotes::install_github("mlr-org/paradox")

Usage

Create a simple ParamSet using all supported Parameter Types:

  • integer numbers ("int")
  • real-valued numbers ("dbl")
  • truth values TRUE or FALSE ("lgl")
  • categorical values from a set of possible strings ("fct")
  • further types are only possible by using transformations.

ps = ParamSet$new(
  params = list(
    ParamInt$new(id = "z", lower = 1, upper = 3),
    ParamDbl$new(id = "x", lower = -10, upper = 10),
    ParamLgl$new(id = "flag"),
    ParamFct$new(id = "methods", levels = c("a","b","c"))
  )
)

Draw random samples / create random design:

generate_design_random(ps, 3)
#> <Design> with 3 rows:
#>    z         x  flag methods
#> 1: 1  7.660348 FALSE       b
#> 2: 3  8.809346 FALSE       c
#> 3: 2 -9.088870 FALSE       b

Generate LHS Design:

requireNamespace("lhs")
#> Loading required namespace: lhs
generate_design_lhs(ps, 3)
#> <Design> with 3 rows:
#>    z         x  flag methods
#> 1: 1 -3.984673  TRUE       b
#> 2: 2  7.938035 FALSE       a
#> 3: 3  1.969783  TRUE       c

Generate Grid Design:

generate_design_grid(ps, resolution = 2)
#> <Design> with 24 rows:
#>     z   x  flag methods
#>  1: 1 -10  TRUE       a
#>  2: 1 -10  TRUE       b
#>  3: 1 -10  TRUE       c
#>  4: 1 -10 FALSE       a
#>  5: 1 -10 FALSE       b
#>  6: 1 -10 FALSE       c
#>  7: 1  10  TRUE       a
#>  [ reached getOption("max.print") -- omitted 18 rows ]

Properties of the parameters within the ParamSet:

ps$ids()
#> [1] "z"       "x"       "flag"    "methods"
ps$levels
#> $z
#> NULL
#> 
#> $x
#> NULL
#> 
#> $flag
#> [1]  TRUE FALSE
#> 
#> $methods
#> [1] "a" "b" "c"
ps$nlevels
#>       z       x    flag methods 
#>       3     Inf       2       3
ps$is_number
#>       z       x    flag methods 
#>    TRUE    TRUE   FALSE   FALSE
ps$lower
#>       z       x    flag methods 
#>       1     -10      NA      NA
ps$upper
#>       z       x    flag methods 
#>       3      10      NA      NA

Parameter Checks

Check that a list of parameter values satisfies all conditions of a ParamSet, using $test() (returns FALSE on mismatch), $check() (returns error description on mismatch), and $assert() (throws error on mismatch):

ps$test(list(z = 1, x = 1))
#> [1] TRUE
ps$test(list(z = -1, x = 1))
#> [1] FALSE
ps$check(list(z = -1, x = 1))
#> [1] "z: Element 1 is not >= 1"
ps$assert(list(z = -1, x = 1))
#> Error in ps$assert(list(z = -1, x = 1)): Assertion on 'list(z = -1, x = 1)' failed: z: Element 1 is not >= 1.

Transformations

Transformations are functions with a fixed signature:

  • x A named list of parameter values
  • param_set the ParamSet used to create the design

Transformations can be used to change the distributions of sampled parameters. For example, to sample values between $2^{-3}$ and $2^{3}$ in a $\log_2$-uniform distribution, one can sample uniformly between -3 and 3 and exponentiate the random value inside the transformation.

ps = ParamSet$new(
  params = list(
    ParamInt$new(id = "z", lower = -3, upper = 3),
    ParamDbl$new(id = "x", lower = 0, upper = 1)
  )
)
ps$trafo = function(x, param_set) {
  x$z = 2^x$z
  return(x)
}
ps_smplr = SamplerUnif$new(ps)
x = ps_smplr$sample(2)
xst = x$transpose()
xst
#> [[1]]
#> [[1]]$z
#> [1] 0.125
#> 
#> [[1]]$x
#> [1] 0.4137243
#> 
#> 
#> [[2]]
#> [[2]]$z
#> [1] 0.5
#> 
#> [[2]]$x
#> [1] 0.3688455
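
The same idea can be tried without paradox. The following is a minimal base R sketch, not package code: the helper name `trafo` and the use of `runif()` in place of a sampler are assumptions made for illustration. It draws exponents uniformly on [-3, 3] and exponentiates them, so the results are spread evenly on the log2 scale.

```r
# Base R sketch of a log2-uniform draw; not paradox code.
set.seed(1)

# Sample exponents uniformly on [-3, 3] ...
u = runif(1000, min = -3, max = 3)

# ... and exponentiate them, using the same signature as the trafo above.
trafo = function(x, param_set = NULL) {
  x$z = 2^x$z
  x
}
out = trafo(list(z = u))

range(out$z)     # stays within [2^-3, 2^3]
mean(out$z < 1)  # about half of the draws fall below 1 (negative exponents)
```

Note that the README example uses a ParamInt for z, so there the transformed values are restricted to the seven powers of two between 2^-3 and 2^3.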

Further documentation can be found in the mlr3book.

paradox's People

Contributors

bblodfon, be-marc, berndbischl, github-actions[bot], jakob-r, mb706, mboecker, michaelchirico, mllg, pat-s, pfistfl, sebffischer, smilesun, sumny, web-flow

paradox's Issues

dependent params

can we please say how we are going to handle those?

because i currently see nothing in the package about this. and this is actually the hard part

shorthand notation for param sets

we could use the PCS definition and parse a string

sort-algo{quick,insertion,merge,heap,stooge,bogo} [bogo]
quick-revert-to-insertion{1,2,4,8,16,32,64} [16]
quick-revert-to-insertion|sort-algo in {quick}
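
As a rough illustration of what parsing such a string could involve, here is a base R sketch that handles only categorical lines of the form `name{v1,v2,...} [default]`. The function name `parse_pcs_categorical` and the returned list layout are hypothetical; conditional lines such as the third one above are deliberately not covered.

```r
# Hypothetical helper: parse one categorical PCS line of the form
#   name{v1,v2,...} [default]
parse_pcs_categorical = function(line) {
  pattern = "^\\s*([^{\\s]+)\\s*\\{([^}]*)\\}\\s*\\[([^]]*)\\]\\s*$"
  m = regmatches(line, regexec(pattern, line, perl = TRUE))[[1]]
  if (length(m) != 4L) {
    stop("not a categorical PCS line: ", line)
  }
  list(
    id = m[2],
    levels = trimws(strsplit(m[3], ",", fixed = TRUE)[[1]]),
    default = trimws(m[4])
  )
}

p1 = parse_pcs_categorical("sort-algo{quick,insertion,merge,heap,stooge,bogo} [bogo]")
p2 = parse_pcs_categorical("quick-revert-to-insertion{1,2,4,8,16,32,64} [16]")
p1$levels
p2$default
```

The levels are returned as strings; converting numeric-looking levels such as "16" would be a separate decision.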

Add temporary params to Learners / Pipeops

Usecase 1: you want to tune RF::mtry, but from [0, 1] as percentage, not as an integer from 1..k

Usecase 2: You have created some smart heuristics to set params, like a,b,c which execute some code.

Proposal: You can add a HP + some piece of code to a pipeop, which maps the setting of that new param to values of already existing ones. you then can use (or tune) the new one
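
A minimal sketch of use case 1 in base R follows. The names `map_mtry` and `mtry_frac` are hypothetical and only illustrate the proposed mapping from a new parameter to an existing one; how such a mapping would be attached to a pipeop is left open.

```r
# Hypothetical mapping from a "temporary" parameter mtry_frac in [0, 1]
# to the existing integer parameter mtry in 1..k.
map_mtry = function(values, k) {
  stopifnot(values$mtry_frac >= 0, values$mtry_frac <= 1)
  # round to the nearest integer, but never go below 1 feature
  values$mtry = max(1L, as.integer(round(values$mtry_frac * k)))
  # the learner never sees the temporary parameter
  values$mtry_frac = NULL
  values
}

res = map_mtry(list(mtry_frac = 0.25, num.trees = 500), k = 20)
res$mtry
#> [1] 5
```

A tuner would then search over mtry_frac, while the learner still receives a valid integer mtry.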

please do not use "sample" so often in unit tests

this makes no sense. "sample" is a stochastic function. for unit tests it is much better to have predefined objects.
it might be ok to use this a couple of times, but it is used very often, while it should probably ONLY be called in test_sampler.R

generate design interface

the code should not be in the paramset, but separate
it might simply look like this

generate_design_lhs(par_set)
generate_design_random(par_set)
generate_design_grid(par_set)

but we need to have an "augment" function?

implement pcs format parser

this is not urgent but kills at least 2 birds with one stone

a) we can parse 3rd party PCS files
b) we have a way to write down param sets with much less typing

we probably don't need more syntactic sugar than that for other abbreviations

Opt path issues

OptPath

we should have as.data.table, as.data.frame simply calls this

add: remove message, transform_x

OptPath should have a transform_x method

potentially remove denorm function

a) that seems specific to LHS designs? or is there any other use?

b) at least its name is bad

c) i don't even know whether we should support LHS. i have never seen evidence that they are that worthwhile, and they are hard to even define properly for complex spaces

conditional params

  1. to a paramset we can add conditions:

cond = Condition$new(child=param, parent=param, cond = cond_equal(rhs))
ps$add_condition(cond)

    rhs is some value / list of values

    can be implemented with

cond_eq = function(rhs) { <return operator ==, with added attribute "rhs">}

  2. we can now easily implement a "check" for feasibility:
    go through all params. if they have no parent, check their value. if they have a parent, check that parent first for its condition.

  3. we can easily construct a tree from the condition.
    have a list S and T. S is all params, T is empty. take an element from S.
    if it either has no parent or the parent is in T, create a node, put the param in T. link the node to its parent.

  4. sampling can easily be implemented with rejection sampling. or we can use the tree
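
The feasibility check and the tree ordering described above could look roughly like this in base R. This is only a sketch under assumptions: plain vectors and lists stand in for the proposed Condition objects, the example space (`algo`, `revert`) is invented, and an inactive child is assumed to be simply absent from the value list.

```r
# Sketch only: a closure stands in for the proposed condition objects.
cond_equal = function(rhs) function(value) value %in% rhs

# Hypothetical space: "revert" is only active if "algo" equals "quick".
parents = c(algo = NA, revert = "algo")
conds = list(revert = cond_equal("quick"))
in_domain = list(
  algo = function(v) v %in% c("quick", "merge"),
  revert = function(v) v %in% c(1, 2, 4, 8)
)

# Feasibility: check the parent's condition first, then the value itself.
is_feasible = function(x) {
  for (id in names(parents)) {
    p = parents[[id]]
    active = is.na(p) || (!is.null(x[[p]]) && conds[[id]](x[[p]]))
    has_value = !is.null(x[[id]])
    if (active && !(has_value && in_domain[[id]](x[[id]]))) return(FALSE)
    if (!active && has_value) return(FALSE)
  }
  TRUE
}

# Tree order: move params from S to `done` once their parent is in `done`.
order_params = function(ids, parents) {
  S = ids
  done = character(0)
  while (length(S) > 0) {
    ready = S[is.na(parents[S]) | parents[S] %in% done]
    if (length(ready) == 0) stop("cyclic or unknown parent")
    done = c(done, ready)
    S = setdiff(S, ready)
  }
  done
}

is_feasible(list(algo = "quick", revert = 4))  # TRUE
is_feasible(list(algo = "merge", revert = 4))  # FALSE: revert is inactive
order_params(c("revert", "algo"), parents)     # "algo" "revert"
```

Sampling along the order returned by `order_params` would avoid rejections entirely, at the cost of having to build that order first.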

export functions

Some functions are not exported although they seem to be needed
e.g.
ParamTreeFac

@smilesun Please check that the vignettes run with an installed version of phng.

Check public methods of ParamSetTree

Do we need the following to be public?

  • setRootHandle (it is not even used anywhere)
  • asample
  • asample.render2str (completely remove that)
  • getFlatList - i like the option to get a nested list, but this should then be sampleList and directly sample. Or is it exactly what getRecursiveList() does?

an example to make dependent tuning work

Suppose we chain a filter wrapper with a tune wrapper and want to co-tune the hyperparameters of the learner (random forest) together with fw.perc. Currently this seems difficult to do: fw.perc is applied first and selects the number of features that are fed into the tune wrapper, while the valid range of the random forest's mtry parameter depends on that number of features, which therefore varies in each tuning iteration. Our new version should make this kind of tuning easier.

ps.ranger = function(p) { 
  makeParamSet(
    # FIXME: mtry must depend on the other parameter "fw.perc"
    # makeIntegerParam("mtry", lower = as.integer(p/10), upper = as.integer(p/1.5)),
    #makeIntegerParam("min.node.size", lower = 1L, upper = 50L, default = 5L),
    #makeIntegerParam("num.trees", lower = 100, upper = 5000, default = 500L),
    makeNumericParam("sample.fraction", lower = 0.1, upper = 1, default = 0.5),
    # makeDiscreteParam("fw.perc", values = PERF_GRID))
    makeNumericParam("fw.perc", lower = 0.001, upper = 0.8))  # feature selection percentage
}


Deep copy of the ParamSet

I have a problem with copying a ParamSet, because it seems that clone(deep = TRUE) does not make a real deep copy. See the example below:

library(paradox)
#> Loading required package: data.table

pl1 <- paradox::ParamSet$new(
  params = list(
    ParamInt$new("a"),
    ParamInt$new("b")
  )
)

pl1$params
#> $a
#> a [integer]: {-Inf, ..., Inf}
#> 
#> $b
#> b [integer]: {-Inf, ..., Inf}
pl2 = pl1$clone(deep = TRUE)

pl2list <- pl2$params
invisible(
  lapply(
    pl2list,
    function(x) x$id = paste("x", x$id, sep = ":"))
)

# new param list with renamed parameters:
# so far so good.
ParamSet$new(params = pl2list)
#> ParamSet: parset 
#> Parameters: 
#> x:a [integer]: {-Inf, ..., Inf}
#> x:b [integer]: {-Inf, ..., Inf}

# However the ids in pl1 were also changed:(
pl1
#> ParamSet: parset 
#> Parameters: 
#> x:a [integer]: {-Inf, ..., Inf}
#> x:b [integer]: {-Inf, ..., Inf}

The explanation appears to be quite simple: R6 makes a copy of the params list, but not of the elements in that list (I think this behavior of R6 makes sense, because it should not examine every data structure in each field to check whether there is something to copy). However, in the ParamSet case, I think that all elements of the params list should be copied when the deep parameter is set to TRUE.

What do you think?
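
The effect can be reproduced without paradox or R6. The base R sketch below uses plain environments, which share the reference semantics of R6 objects; `make_param` and `copy_env` are hypothetical helpers for illustration. R6 also documents a private `deep_clone` method as a hook for customizing deep copies, and whether ParamSet should use it is a design decision for the maintainers.

```r
# Base R illustration with environments, which have the same reference
# semantics as R6 objects. Not paradox code.
make_param = function(id) {
  e = new.env()
  e$id = id
  e
}

params = list(a = make_param("a"), b = make_param("b"))

# Copying the list copies only the references to the environments.
shallow = params
shallow$a$id = "x:a"
params$a$id
#> [1] "x:a"

# A deep copy has to copy every element explicitly.
copy_env = function(e) list2env(as.list(e, all.names = TRUE), envir = new.env())
deep = lapply(params, copy_env)
deep$b$id = "x:b"
params$b$id
#> [1] "b"
```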

allow user to set a concrete value to a Parameter

I think it would be good if the user could set a concrete value on a ParamInt, for example, so that he/she can set the whole hyperparameter set manually. This feature would allow phng to be used in any R package as an option manager.

Function to print x values

Function values are stored in named lists.
To transform them to a single string you could use paste(names(x), x, sep = "=", collapse = ",")
This is problematic for

  • Long values
  • x values that cannot be converted to a character. These should not exist, because complex types are only created by transformation. But we have an untyped param class.
  • Real-valued numbers with many decimal places,
    because they can mess up the output.

So we want to shorten and format some of them.
Formatting and shortening should be configurable.

Each ParamNode should be able to transform a named list to a character.
I propose Param(Set/Real/...)$value_to_string(x).
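
A standalone sketch of such a formatter in base R is shown below. The arguments `digits` and `max_width` and the placeholder for non-atomic values are assumptions for illustration, not a proposal for the final interface.

```r
# Hypothetical helper, not part of paradox: format a named list of
# parameter values as one short string.
value_to_string = function(x, digits = 3, max_width = 10) {
  fmt = vapply(x, function(v) {
    s = if (is.numeric(v)) {
      format(v, digits = digits)       # limit decimal places
    } else if (is.atomic(v)) {
      as.character(v)
    } else {
      paste0("<", class(v)[1], ">")    # untyped values: show the class only
    }
    s = paste(s, collapse = ";")
    # shorten long values and mark the cut with "..."
    if (nchar(s) > max_width) paste0(substr(s, 1, max_width - 3), "...") else s
  }, character(1))
  paste(names(x), fmt, sep = "=", collapse = ",")
}

value_to_string(list(z = 2L, x = 0.4137243, methods = "a_very_long_level_name"))
#> [1] "z=2,x=0.414,methods=a_very_..."
```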

FIXMEs to issues

i left some fixmes in the code while traveling.
i should convert them to issues very soon

special.vals don't work at all

x = ParamReal$new("x", lower = 1, special.vals = list("a"))

that does not even create an object.

  • the "test" function must be unit tested
  • it needs to be documented what the effect on "sample" is.
  • it needs to be documented what the effect on "denorm" is

Sampling of values

should not be in paramset and params, but separate.

we also need to be able to flexibly implement different samplers.

this might work

A couple of mini classes like this:

PSamplerIntUnif
PSamplerNumNormal
PSamplerNumUnif
they all inherit from PSampler, implement p$sample(n)

Then we have this:
pss = ParamSetSamplerIndep$new(list of samplers)
pss$sample(10)

inherits from ParamSetSampler
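
To make the proposed structure concrete, here is a base R sketch that uses closures in place of R6 classes. All constructor names are illustrative stand-ins for PSamplerIntUnif, PSamplerNumUnif, PSamplerNumNormal and ParamSetSamplerIndep; inheritance is not modeled.

```r
# Sketch with closures instead of R6 classes; all names are illustrative.
sampler_int_unif = function(id, lower, upper) {
  # sample.int avoids the sample(x) pitfall when lower == upper
  list(id = id, sample = function(n) {
    lower + sample.int(upper - lower + 1L, n, replace = TRUE) - 1L
  })
}
sampler_num_unif = function(id, lower, upper) {
  list(id = id, sample = function(n) runif(n, lower, upper))
}
sampler_num_normal = function(id, mean, sd) {
  list(id = id, sample = function(n) rnorm(n, mean, sd))
}

# Independent joint sampler: draw each column from its own sampler.
param_set_sampler_indep = function(samplers) {
  list(sample = function(n) {
    cols = lapply(samplers, function(s) s$sample(n))
    names(cols) = vapply(samplers, function(s) s$id, character(1))
    as.data.frame(cols)
  })
}

pss = param_set_sampler_indep(list(
  sampler_int_unif("z", 1L, 3L),
  sampler_num_unif("x", -10, 10),
  sampler_num_normal("y", mean = 0, sd = 1)
))
set.seed(1)
design = pss$sample(10)
```

A dependent or joint sampler would implement the same `sample(n)` function but draw the columns together.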

ParamSet Vectorized Test for Feasibility

The expressions in the requirements are supposed to work vectorized.

However for the checkmate assert/test/check functions we only evaluate them on single values. Somehow we would like to have a TRUE/FALSE vector for a bunch of values.
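
One possible shape for such a vectorized test, sketched in base R: bounds are checked on whole columns and a requirement is stored as an unevaluated expression that is evaluated once against the data. The names `bounds`, `requirement` and `test_vectorized` are assumptions, not existing paradox API.

```r
# Sketch: bounds stored per parameter; checks run on whole columns.
bounds = list(z = c(1, 3), x = c(-10, 10))

# A requirement written as an expression that works on vectors.
requirement = quote(z < 2 | x > 0)

test_vectorized = function(values) {
  ok = rep(TRUE, nrow(values))
  for (id in names(bounds)) {
    v = values[[id]]
    ok = ok & !is.na(v) & v >= bounds[[id]][1] & v <= bounds[[id]][2]
  }
  # evaluate the requirement once for all rows
  ok & eval(requirement, values)
}

test_vectorized(data.frame(z = c(1, -1, 3, 3), x = c(1, 1, 11, -5)))
#> [1]  TRUE FALSE FALSE FALSE
```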

remove restriction stuff now

seems not to be consistently implemented? if it works and is "simple" enough it can stay!

"forbidden" might be a better name

remove toStringVal

I would like to hide the fact that the tree param set contains actual values from the user.

createCollectionParamSet is a bad name

probably repParam is better.

one could also think about rep.Param, and implement a new S3 method, but as the id is changed this is kind of a violation of the fact that R in other cases simply copies the object?

or call it vectorizeParam
