rulediscovery's Introduction

Rule Generation for Classification: Scalability, Interpretability, and Fairness

Adia Lumadjeng, Tabea Röber, M. Hakan Akyüz, and Ş. Ilker Birbil

We introduce a new rule-based optimization method for classification with constraints. The proposed method takes advantage of linear programming and column generation, and hence is scalable to large datasets. Moreover, the method returns a set of rules along with their optimal weights, which indicate the importance of each rule for learning. By assigning cost coefficients to the rules and introducing additional constraints, we show that one can also account for the interpretability and fairness of the results. We test the performance of the proposed method on a collection of datasets and present two case studies to elaborate on its different aspects. Our results show that the proposed rule-based learning method can achieve a good compromise between interpretability and fairness on the one hand and accuracy on the other.

You can find the details in our manuscript.

This notebook illustrates how to use RUX and RUG.

Installation

Install Anaconda Distribution.
Create a new environment and install the necessary packages:

conda create -n rulediscovery -c conda-forge numpy pandas scikit-learn cvxpy cvxopt

Activate the new environment and install the gurobi package in it:

conda activate rulediscovery

conda install -c gurobi gurobi
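After the installation, a quick check can confirm that the packages are visible from the activated environment. The following is a minimal sketch, not part of the repository: the helper name missing_packages is hypothetical, and the list uses import names, which differ from the conda package names in two cases (scikit-learn is imported as sklearn, and the gurobi package as gurobipy).

```python
from importlib.util import find_spec

# Import names for the packages installed above (assumption: scikit-learn
# is imported as "sklearn" and the gurobi conda package as "gurobipy").
REQUIRED = ["numpy", "pandas", "sklearn", "cvxpy", "cvxopt", "gurobipy"]


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print(f"Missing packages: {missing}")
    else:
        print("All packages found.")
```

An empty result only means the packages can be located; it does not verify that a Gurobi license is configured.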

Repo structure

The code contains the following files to reproduce the results of our manuscript:

In the Jupyter notebook RuleDiscovery.ipynb, we demonstrate how to use RUG and RUX on a single fold of the ecoli dataset. The code produces the results of RF, ADA, and GB, along with RUX(RF), RUX(ADA), RUX(GB), and RUG.
The folder num_exp contains all files for the numerical experiments and the case study reported in the manuscript. Please see the README.md in that directory for more details.

Solvers

The default value of the solver option is 'gurobi'. To use the Gurobi solver, you must install it first; it is freely available for academic use (see the related page on Gurobi's website). The current version of our code also supports the open-source solver GLPK (set solver='glpk').
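If it is unclear whether Gurobi is available on a given machine, the solver string can be chosen at run time. The sketch below is an illustration under assumptions: choose_solver is a hypothetical helper that is not part of the repository, and it only checks whether the gurobipy module can be found, not whether a valid license is present.

```python
from importlib.util import find_spec


def choose_solver(preferred="gurobi"):
    """Hypothetical helper: fall back to 'glpk' when gurobipy is not installed."""
    if preferred == "gurobi" and find_spec("gurobipy") is None:
        # gurobipy cannot be located, so use the open-source alternative.
        return "glpk"
    return preferred


solver = choose_solver()
print(f"Using solver: {solver}")
```

The resulting string would then be passed as the solver option described above.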

rulediscovery's Issues

ruxg.py line 413

Issue:
The line self.ruleLengthPerSample = np.mean(rule_lengths) assigns the mean of rule_lengths to self.ruleLengthPerSample as a whole, replacing the array. However, the intention seems to be to store this mean at a specific index within self.ruleLengthPerSample.

Proposed Correction:
self.ruleLengthPerSample[sindx] = np.mean(rule_lengths). This change ensures that the mean value is placed at the intended index within the array.
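The difference between the two assignments can be reproduced in isolation. The following sketch assumes that ruleLengthPerSample is initialized as a NumPy array with one entry per sample; the class LengthTracker and its method names are hypothetical stand-ins and are not taken from ruxg.py.

```python
import numpy as np


class LengthTracker:
    """Hypothetical stand-in used only to illustrate the issue."""

    def __init__(self, n_samples):
        # Assumption: one slot per sample, initialized to zero.
        self.ruleLengthPerSample = np.zeros(n_samples)

    def record_overwriting(self, sindx, rule_lengths):
        # Pattern reported in the issue: the attribute is rebound to a
        # scalar, so the per-sample array is lost and sindx is ignored.
        self.ruleLengthPerSample = np.mean(rule_lengths)

    def record_indexed(self, sindx, rule_lengths):
        # Proposed correction: store the mean at position sindx only.
        self.ruleLengthPerSample[sindx] = np.mean(rule_lengths)


buggy = LengthTracker(3)
buggy.record_overwriting(1, [2, 4])

fixed = LengthTracker(3)
fixed.record_indexed(1, [2, 4])

print(np.ndim(buggy.ruleLengthPerSample))  # 0: a scalar, no longer an array
print(fixed.ruleLengthPerSample)           # mean stored at index 1 only
```

With the indexed version, the entries for the other samples are left untouched.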
