GENESIM: GENetic Extraction of a Single, Interpretable Model

This repository contains an algorithm that first constructs an ensemble using well-known decision tree induction algorithms such as CART, C4.5, QUEST and GUIDE, combined with bagging and boosting. This ensemble is then converted to a single, interpretable decision tree in a genetic fashion. For a certain number of iterations, random pairs of decision trees are merged by first converting them to sets of k-dimensional hyperplanes and then calculating the intersection of these two sets (a classic problem from computational geometry). Moreover, in each iteration, an individual is mutated with a certain probability. After these iterations, the accuracy on a validation set is measured for each of the decision trees in the population, and the one with the highest accuracy (and the lowest number of nodes in case of a tie) is returned.

Example.py contains code to run all implemented algorithms and reports their average predictive performance, computational complexity and model complexity on a number of datasets.
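The merge and selection steps described above can be illustrated with a minimal sketch. It is not the code from this repository: it assumes each leaf of a tree can be represented as an axis-aligned box (a dict mapping feature names to `(low, high)` bounds) with a class label, and the names `intersect_boxes`, `merge_region_sets` and `select_best` are hypothetical. How the class of an intersected region is decided is not described here, so the sketch simply keeps both labels.

```python
from itertools import product

# A region is (box, label), where box = {feature: (low, high)}.
# This representation is an assumption made for illustration only.

def intersect_boxes(a, b):
    """Return the overlap of two boxes, or None if they do not overlap."""
    out = {}
    for feat in a:
        low = max(a[feat][0], b[feat][0])
        high = min(a[feat][1], b[feat][1])
        if low >= high:  # empty overlap along this feature
            return None
        out[feat] = (low, high)
    return out

def merge_region_sets(regions_a, regions_b):
    """Intersect every region of one tree with every region of the other."""
    merged = []
    for (box_a, label_a), (box_b, label_b) in product(regions_a, regions_b):
        box = intersect_boxes(box_a, box_b)
        if box is not None:
            # Keep both labels; resolving them is left open in this sketch.
            merged.append((box, (label_a, label_b)))
    return merged

def select_best(population):
    """Pick the highest validation accuracy; break ties by fewest nodes."""
    return max(population, key=lambda ind: (ind["accuracy"], -ind["nodes"]))

# Two toy trees over features x and y, each splitting the space once.
tree_a = [({"x": (0, 5), "y": (0, 10)}, "A"), ({"x": (5, 10), "y": (0, 10)}, "B")]
tree_b = [({"x": (0, 10), "y": (0, 4)}, "A"), ({"x": (0, 10), "y": (4, 10)}, "B")]
merged = merge_region_sets(tree_a, tree_b)

population = [
    {"name": "t1", "accuracy": 0.9, "nodes": 7},
    {"name": "t2", "accuracy": 0.9, "nodes": 5},
    {"name": "t3", "accuracy": 0.8, "nodes": 3},
]
best = select_best(population)
```

Here the two toy trees each split the space once along a different feature, so the merged set contains four regions, and `select_best` prefers `t2` over `t1` because both have the same accuracy but `t2` has fewer nodes.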

Dependencies

An install.sh script is provided that installs all required dependencies.

Documentation

A documentation page is available in the doc/ directory. Download the complete directory and open index.html.

Decision Tree Induction Algorithm Wrappers

Wrappers are provided for Orange C4.5, sklearn CART, GUIDE and QUEST. Each wrapper returns a Decision Tree object, which is defined in decisiontree.py. Several methods are available on this decision tree: classifying new, unknown samples; visualising the tree; exporting it to string, JSON and DOT; etc.
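To give an idea of what such a tree object can offer, here is a minimal, self-contained sketch of a binary decision tree with a classify method and a DOT export. It is not the class from decisiontree.py: the `Node` class, its field names and the `classify` and `to_dot` methods are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    # Internal nodes set feature/threshold/left/right; leaves set label only.
    feature: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None    # branch taken when value <= threshold
    right: Optional["Node"] = None   # branch taken when value > threshold
    label: Optional[str] = None

    def classify(self, sample):
        """Walk from the root to a leaf and return the leaf's label."""
        node = self
        while node.label is None:
            if sample[node.feature] <= node.threshold:
                node = node.left
            else:
                node = node.right
        return node.label

    def to_dot(self):
        """Export the tree as a Graphviz DOT string."""
        lines = ["digraph tree {"]
        counter = [0]  # mutable counter shared by the nested function

        def visit(node):
            idx = counter[0]
            counter[0] += 1
            if node.label is not None:
                lines.append(f'  n{idx} [label="{node.label}", shape=box];')
            else:
                lines.append(f'  n{idx} [label="{node.feature} <= {node.threshold}"];')
                for child, edge in ((node.left, "yes"), (node.right, "no")):
                    child_idx = visit(child)
                    lines.append(f'  n{idx} -> n{child_idx} [label="{edge}"];')
            return idx

        visit(self)
        lines.append("}")
        return "\n".join(lines)

# A one-split tree on a made-up feature.
tree = Node("petal_length", 2.5, Node(label="setosa"), Node(label="other"))
prediction = tree.classify({"petal_length": 1.4})
dot = tree.to_dot()
```

The resulting DOT string can be rendered with Graphviz, for example by saving it to a file and running `dot -Tpng`.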

Ensemble Technique Wrappers

Wrappers are provided for the well-known ensemble techniques XGBoost and Random Forests.

Similar techniques

A wrapper written around the R package inTrees and an implementation of ISM can be found in the constructors package.

New dataset

A new dataset can easily be plugged into the benchmark. To do so, write a load_dataset() function in load_datasets.py.
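As a rough sketch of what such a loader might look like, the function below parses a small CSV and returns the rows together with the feature names, the label column and a dataset name. The function name `load_toy`, the inline data and the return shape are all assumptions made for illustration; check the existing functions in load_datasets.py for the signature the benchmark actually expects.

```python
import csv
import io

# Inline CSV stands in for a real data file; the columns are made up.
TOY_CSV = """sepal_length,petal_length,species
5.1,1.4,setosa
6.3,4.9,versicolor
5.8,5.1,virginica
"""

def load_toy(source=None):
    """Hypothetical loader: returns (rows, feature_names, label_name, dataset_name)."""
    handle = source if source is not None else io.StringIO(TOY_CSV)
    reader = csv.DictReader(handle)
    label_name = "species"
    # Every column except the label is treated as a numeric feature.
    feature_names = [c for c in reader.fieldnames if c != label_name]
    rows = []
    for record in reader:
        row = {name: float(record[name]) for name in feature_names}
        row[label_name] = record[label_name]
        rows.append(row)
    return rows, feature_names, label_name, "toy"

rows, feature_names, label_name, dataset_name = load_toy()
```

Passing an open file object as `source` would read from disk instead of the inline sample.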

Contact

You can contact me at givdwiel.vandewiele at ugent.be with any questions or proposals, or if you wish to contribute.

Referring

Please cite my work when you use it, either by referring to this GitHub repository or to the following (as yet unpublished) paper:

@article{vandewiele2016genesim,
  title={GENESIM: genetic extraction of a single, interpretable model},
  author={Vandewiele, Gilles and Janssens, Olivier and Ongenae, Femke and De Turck, Filip and Van Hoecke, Sofie},
  journal={arXiv preprint arXiv:1611.05722},
  year={2016}
}
