Adia Lumadjeng, Tabea Röber, M. Hakan Akyüz, and Ş. Ilker Birbil
We introduce a new rule-based optimization method for classification with constraints. The proposed method takes advantage of linear programming and column generation, and hence is scalable to large datasets. Moreover, the method returns a set of rules along with their optimal weights, which indicate the importance of each rule for learning. By assigning cost coefficients to the rules and introducing additional constraints, we show that one can also consider interpretability and fairness of the results. We test the performance of the proposed method on a collection of datasets and present two case studies to elaborate on its different aspects. Our results show that the proposed rule-based learning method can achieve a good compromise between interpretability and fairness on the one side, and accuracy on the other.
You can find the details in our manuscript.
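To give a feel for what "a set of rules along with their weights" can look like in practice, here is a minimal toy sketch. It is not the code in this repository: the rules, the weights, the feature names, and the weighted-vote prediction scheme are all assumptions made for illustration only; see the manuscript for the actual formulation.

```python
from collections import defaultdict

# Hypothetical rules for illustration: (condition, predicted class, weight).
# In the actual method the weights come from solving a linear program;
# here they are simply made-up numbers.
RULES = [
    (lambda x: x["age"] < 30, "A", 0.6),
    (lambda x: x["income"] >= 50, "B", 0.9),
    (lambda x: x["age"] >= 30 and x["income"] < 50, "A", 0.4),
]


def predict(sample, rules):
    """Add up the weights of all rules covering `sample`, per class,
    and return the class with the largest total (None if no rule applies)."""
    scores = defaultdict(float)
    for condition, label, weight in rules:
        if condition(sample):
            scores[label] += weight
    if not scores:
        return None  # no rule covers this sample
    return max(scores, key=scores.get)
```

For example, `predict({"age": 25, "income": 60}, RULES)` is covered by the first two rules; the heavier rule decides the class. Inspecting which rules fired, and with what weight, is what makes such a prediction easy to explain.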
This notebook illustrates how to use RUX and RUG.
- Install Anaconda Distribution.
- Create a new environment and install the necessary packages:
    conda create -n rulediscovery -c conda-forge numpy pandas scikit-learn cvxpy cvxopt
- Activate the environment and install the gurobi package in it:
    conda activate rulediscovery
    conda install -c gurobi gurobi
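After completing the installation steps, it may help to confirm that the packages are visible from Python inside the activated environment. The sketch below uses only the standard library; the helper name `missing_modules` is made up for this example, and the list of import names is an assumption based on the packages named in the commands above (scikit-learn is imported as `sklearn`, and the gurobi package as `gurobipy`).

```python
from importlib.util import find_spec

# Import names assumed for the packages installed above.
REQUIRED_MODULES = ["numpy", "pandas", "sklearn", "cvxpy", "cvxopt", "gurobipy"]


def missing_modules(names):
    """Return the entries of `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Note that this only checks whether the modules can be located; it does not verify package versions or that a Gurobi license is set up.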
The code contains the following files to reproduce the results of our manuscript:
- In the Jupyter notebook RuleDiscovery.ipynb, we demonstrate how to use RUG and RUX on a single fold of the ecoli dataset. The code produces the results of RF, ADA and GB, along with RUX(RF), RUX(ADA), RUX(GB), and RUG.
- The folder num_exp contains all files for the numerical experiments and the case study reported in the manuscript. Please see the README.md in that directory for more details.
Note that the default for the solver option is 'gurobi'. To use the Gurobi solver, you first need to install it. The solver is freely available for academic use; check the related page on Gurobi's website. The current version of our code also supports the open-source solver GLPK (set solver='glpk').
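If you are unsure whether Gurobi is available on a given machine, one option is to pick the value of the solver option at run time. The following is a minimal sketch, not part of our code: `choose_solver` is a hypothetical helper, and it assumes that the Gurobi Python bindings are importable as `gurobipy`. Only the two option values mentioned above, 'gurobi' and 'glpk', are used.

```python
from importlib.util import find_spec


def choose_solver(preferred="gurobi"):
    """Return 'gurobi' when its Python bindings can be found, else 'glpk'.

    Any other preferred value is returned unchanged.
    """
    if preferred == "gurobi" and find_spec("gurobipy") is None:
        return "glpk"  # fall back to the open-source solver
    return preferred


if __name__ == "__main__":
    # Pass the result wherever the solver option is set, e.g. solver=choose_solver()
    print("Using solver:", choose_solver())
```

Being able to import `gurobipy` does not guarantee a valid license, so a run may still fail at solve time; in that case, set solver='glpk' explicitly.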