
hal-cgp's People

Contributors

henrikmettler, jakobj, mschmidt87


hal-cgp's Issues

Add tests for gradient-based local search

Test cases should cover at least:

  • increase in fitness in a single step
  • stability of the optimal solution, i.e., a single step does not move the parameters away if they are already optimal

Add more nodes

As an extension to the nodes currently implemented, add nodes that perform:

  • comparisons: less than (<) and greater than (>)
  • if/else branches
  • Boolean operators: AND, OR, NOT
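A sketch of what such nodes could look like, assuming a node interface in which each node renders its output as a Python expression string over its input symbols, mirroring the existing arithmetic nodes (class and method names here are illustrative, not the library's actual API):

```python
# Illustrative sketch only: the real node base class and method names may
# differ; truth values are encoded as floats (> 0.0 means true).

class LessThan:
    """Comparison node: 1.0 if x_0 < x_1, else 0.0."""
    arity = 2

    def output_str(self, inputs):
        return f"(1.0 if {inputs[0]} < {inputs[1]} else 0.0)"


class IfElse:
    """Branch node: selects x_1 if condition x_0 is true, else x_2."""
    arity = 3

    def output_str(self, inputs):
        return f"({inputs[1]} if {inputs[0]} > 0.0 else {inputs[2]})"


class And:
    """Boolean node: 1.0 if both inputs are true, else 0.0."""
    arity = 2

    def output_str(self, inputs):
        return f"(1.0 if ({inputs[0]} > 0.0 and {inputs[1]} > 0.0) else 0.0)"
```

Encoding truth values as floats keeps these nodes compatible with graphs that otherwise operate on real numbers.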

Add consistency check to caching decorator

Currently, the caching decorator does not check whether the function to be cached is actually the same as the one that the cache was created for. Thus, it can happen that the decorator mistakes a different function for the cached function.

This is a minimal reproducer:

import gp


@gp.utils.disk_cache("example_caching_cache.pkl")
def f1(x):
    return x


@gp.utils.disk_cache("example_caching_cache.pkl")
def f2(x):
    return x**2

for i in range(10):
    print(f"x = {f1(i)}, x^2 = {f2(i)}")

yields

x = 0, x^2 = 0
x = 1, x^2 = 1
x = 2, x^2 = 2
x = 3, x^2 = 3
x = 4, x^2 = 4
x = 5, x^2 = 5
x = 6, x^2 = 6
x = 7, x^2 = 7
x = 8, x^2 = 8
x = 9, x^2 = 9

Proposed solution: let the decorator verify, for a cached function, that calling the function with the first cached argument reproduces the cached result.

CGPGraph.to_torch() generated class does not return the correct shape for constant functions

When a genome represents a constant function, e.g. f(x)=1, the torch class generated by CGPGraph does not return the correct shape. I created a test for this on the bugfix/torch_constant_output_shape branch: https://github.com/jakobj/python-gp/blob/5a50c07876adf9e8a263b4e10c0e01c5b582fb68/test/test_cgp_graph.py#L161

This can probably be fixed by modifying the CGPConstantFloat.format_output_str_torch() method.
https://github.com/jakobj/python-gp/blob/3bdff24be10ef503652b44b0be1606a639555627/gp/cgp_node.py#L191
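A hedged sketch of the idea (the class below is a stand-in, not the library's CGPConstantFloat; the point is that the emitted expression should broadcast the constant against the batch dimension rather than yielding a 0-d scalar):

```python
# Stand-in class illustrating one possible fix; attribute and method names
# are modeled loosely on the library but are not its actual code.
class ConstantFloat:
    def __init__(self, value):
        self._value = value

    def format_output_str_torch(self):
        # emit an expression of shape (batch_size,) by multiplying the
        # constant with a ones tensor of the batch size ("x" is assumed to
        # be the input tensor of the generated forward method)
        return f"({self._value} * torch.ones(x.shape[0]))"
```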

Introduce multiplicative and additive parameters for each node

Currently the only way to introduce constants in the expressions is via the ConstantFloat or Parameter nodes. The values of the parameter nodes can be tuned by local search to specific values in order to increase the fitness of individuals.
However, there is a fundamental problem. As the output of a node can be used as input by more than one subsequent node, it can happen that a particular Parameter node feeds into two subsequent nodes. If one of them requires the parameter to be large and the other requires it to be small, the gradients point in opposite directions, bringing the local search for this particular parameter to a halt, or at least slowing it down significantly. This can lead to difficult-to-escape local minima in the search landscape.

This can be avoided by additionally introducing parameters private to each node (e.g., Izzo et al., 2017). These private parameters are by definition not used by any other node and hence are never subject to "opposing" gradient information. For simplicity, a first implementation would introduce two parameters: one that scales the output of each node and one that shifts it. By default these would have the values 1.0 and 0.0, respectively, and hence would not change the behavior of the node from the current implementation. I guess it could be helpful to introduce a mechanism that allows users to choose between updating only values of Parameter nodes, only private node parameters, or both. As a side remark, this would allow the evolution of simple ANNs by introducing nodes that only apply a non-linearity to their input (e.g., CGPANNs; Khan et al., 2013).

Izzo, D., Biscani, F., & Mereta, A. (2017). Differentiable Genetic Programming. In European Conference on Genetic Programming (pp. 35-51). Springer, Cham.
Khan, M. M., Ahmad, A. M., Khan, G. M., & Miller, J. F. (2013). Fast Learning Neural Networks using Cartesian Genetic Programming. Neurocomputing, 121, 274-289.
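The core of the proposal can be sketched in a few lines (names and structure hypothetical; in the library the parameters would live in the node/genome, with gradients flowing through them during local search):

```python
import math


class NodeWithPrivateParameters:
    """Node whose output is w * op(inputs) + b, with w, b private to the node."""

    def __init__(self, op):
        self._op = op  # the node's operation, e.g. math.tanh
        self.w = 1.0   # private multiplicative parameter (default: no scaling)
        self.b = 0.0   # private additive parameter (default: no shift)

    def __call__(self, *inputs):
        # with the defaults w=1.0, b=0.0 this reproduces the current behavior
        return self.w * self._op(*inputs) + self.b
```

Since w and b belong to exactly one node, no other node's gradient can pull them in an opposing direction.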

Remove `CGPGraph.to_...` methods?

The to_func, to_torch, and to_sympy methods of the CGPGraph class simply call the respective compile_... methods and do not add any functionality. I therefore suggest removing them or renaming the compile_... methods.

Use `pytest.approx` functions in tests

Some of the tests use constructions like assert abs(1.0 - y[0]) < 1e-15. We should probably use pytest.approx in these places for consistent error reporting.
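For instance, the hand-rolled tolerance check could become the following (pytest.approx reports both values on failure, unlike a bare inequality):

```python
import pytest

y = [1.0]

# instead of: assert abs(1.0 - y[0]) < 1e-15
assert y[0] == pytest.approx(1.0, abs=1e-15)
```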

Avoid random failures of `test_cgp_evolve`

Sometimes, presumably depending on the seed, the test for evolve fails because it does not converge to the solution. To avoid frequently restarting Travis, we should fix this by adapting the parameters of the search, e.g., allowing more generations.

Fix examples

Currently, the example scripts are broken due to changes in the API.

Provide `to_torch` compilation in `Individual`

The pattern

graph = gp.CartesianGraph(individual.genome)
f = graph.to_torch()

is used often enough to warrant a convenience function, similar to those for to_func and to_sympy in the Individual class. This would shorten the above to

f = individual.to_torch()

Use code formatter

As discussed in #3, we should use a formatter (black) to ensure a consistent code style.

Move parameter value storage from `Individual` to `Genome`

The output values of Parameter nodes are currently stored in each individual (https://github.com/Happy-Algorithms-League/python-gp/blob/master/gp/individual.py#L33). I would argue this is rather surprising, since these values define the behavior of nodes encoded in the Genome and, like it, are passed on to the next generation.
I think it would hence be more natural to store these values within the Genome. This would also remove the need to pass the dictionary around from the Individual to all to... functions (e.g., https://github.com/Happy-Algorithms-League/python-gp/blob/master/gp/individual.py#L123).

Add abstract class for algorithms

In order to simplify extending the library by more evolutionary algorithms, we should create a base class for algorithms that particular algorithms inherit from.
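One possible shape for such a base class (the method names are suggestions only, not the library's final interface):

```python
from abc import ABC, abstractmethod


class EvolutionaryAlgorithm(ABC):
    """Interface that concrete algorithms (e.g., mu + lambda) implement."""

    @abstractmethod
    def initialize_fitness_parents(self, pop, objective):
        """Evaluate the fitness of all parents in the population."""

    @abstractmethod
    def step(self, pop, objective):
        """Perform one generation: create offspring, evaluate, select."""
```

Any new algorithm then only needs to implement these two hooks, and the evolve loop can be written once against the abstract interface.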

Provide `to_numpy` compilation

As far as I can see, to_func produces a Python function that can only operate on scalar values. We should provide a to_numpy function that returns a Python function able to digest vectors. What do you think @mschmidt87?
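The difference can be illustrated with the expression strings a graph compiles to (the expression here is hypothetical): the same compiled lambda works on a list of scalars and, via NumPy broadcasting, on arrays of shape (n_inputs, n_samples):

```python
import numpy as np

# hypothetical expression string as produced from a graph
expr = "x[0] * x[0] + 1.0"
func = eval(f"lambda x: {expr}")

# to_func-style: x is a list of scalar inputs, one float out
assert func([2.0]) == 5.0

# to_numpy-style: x is a 2-d array of shape (n_inputs, n_samples);
# numpy broadcasting evaluates the expression elementwise
x = np.array([[0.0, 1.0, 2.0]])
assert np.allclose(func(x), [1.0, 2.0, 5.0])
```

So a to_numpy compilation may need little more than documenting and testing the vectorized input convention.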

Make sympy/torch optional

I think we don't require those packages for basic functionality; they should hence be marked as optional, and the package should work even if they are not available on the system.
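The standard pattern for this (a sketch, not the library's code) is to attempt the import at module load and fail only when the dependent feature is actually used:

```python
try:
    import torch  # noqa: F401
    torch_available = True
except ModuleNotFoundError:
    torch_available = False


def to_torch(graph):
    # raise only when the torch-dependent feature is actually used,
    # so the rest of the package works without the optional dependency
    if not torch_available:
        raise ModuleNotFoundError(
            "No module named 'torch' (optional extra requirement)"
        )
```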

Remove "CGP/cgp" prefixes where possible

We are using a lot of unnecessary prefixes, e.g., "CGPGraph", "CGPInputNode", etc., which I guess was motivated by having not only a CGPPopulation but also a BinaryPopulation class. Since we decided to remove the latter, we should also go ahead and remove these prefixes to make the code more readable and avoid unnecessary typing.

Improve README example

The example in the README should follow our convention for defining parameters (four dicts: population, genome, ea, evolve) that is used in the tests and examples.

Include caching example

We should add an example of an expensive fitness evaluation, e.g., simply using sleep, in which we demonstrate how to use the caching decorator.

Caching decorator doubles runtime for n_processes > 1

As shown in #86, the decorator should not be used on nested functions because the consistency check (one execution of the objective on cached input) is applied on the first call after decoration. In a multiprocessing scenario (n_processes > 1), the decorator is applied on each evaluation of the decorated function, even if the function is defined globally, due to the pickling/unpickling within multiprocessing, thus doubling the runtime.

One way to avoid this problem is suggested in the discussion of #86:

The user can provide an optional keyword argument, e.g., check_consistency, to the decorator, which by default is True. If check_consistency is true, we use the currently implemented checks that work for single processes. In addition, we provide a function check_cache_consistency(<fn>, <objective>) that the user can call manually. So:

  • single process: decorate with disk_cache(<fn>)
  • multiple processes: decorate with disk_cache(<fn>, check_consistency=False) and call check_cache_consistency(<fn>, <objective>) manually once at the beginning, e.g., before evolve

`InvalidSympyExpression` just passes?

The InvalidSympyExpression exception currently just passes and does not raise any error or print any information. Is this the intended, final behavior?

Enable full install/test on Travis

The installation on Travis via pip install . does not install extra requirements, and hence all tests for sympy and torch are skipped. We should perform an installation with extras on Travis to test those functions as well.

Provide mechanisms to cache python functions/pytorch classes

Currently, the example for caching uses SymPy expressions. This makes sense since SymPy can simplify expressions, thereby identifying functionally equivalent individuals even when their genotypes and even their (unsimplified) phenotypes differ. However, one might want to avoid converting the Cartesian graph to a SymPy expression and rather work with Python functions or PyTorch classes. In these cases, caching would still make sense for individuals that only differ in inactive genes. How could we provide such functionality most easily?
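One possibility, sketched here, is to key the cache on the genome's active genes only, so that individuals differing solely in inactive genes share a cache entry (the flat-list genome and explicit index set below are stand-ins for the library's actual encoding):

```python
def active_gene_key(genome, active_indices):
    """Cache key built only from active genes; inactive genes are ignored."""
    return tuple(genome[i] for i in sorted(active_indices))


# two genomes that differ only at an inactive position map to the same key
g1 = [1, 5, 2, 9, 3]  # gene at index 3 is inactive
g2 = [1, 5, 2, 7, 3]
active = {0, 1, 2, 4}
```

This does not capture functional equivalence of structurally different active graphs (which SymPy simplification can), but it needs no symbolic backend at all.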

`CGPGraph.pretty_print` cuts off letters

For some graphs, pretty_print cuts off the last letter of "Output":

00   Input              01 * Const              03   Mul                05 * Outpu (01)

I guess that's a bug.

Also, this function should not have "print" in its name, as it only returns a string and does not actually print anything.

Check whether silent mutations in node inputs are allowed

In principle, each node has the same number of input genes, independent of its currently selected operation; e.g., a node with arity 1 still has two input genes if the maximum arity in the graph is two. Does the current implementation allow the mutation of these silent input genes, or are they excluded? Since silent mutations generally seem to be beneficial, we should make sure such mutations are allowed.

Investigate separation of Population class and evolutionary algorithm

Currently, the evolutionary algorithm is implemented as methods of the Population class (generate_new_parent and generate_offspring). We should investigate whether it is possible to disentangle the two, separating the representation of genome and phenotype (CGP) from the optimization algorithm (NSGA II).
