jlevy44 / interactiontransformer

Extract meaningful interactions from machine learning models to obtain machine-learning performance with statistical model interpretability.

License: MIT License

Python 83.52% R 16.25% Shell 0.23%

interactiontransformer's Issues

Possibility of using InteractionTransformer in combination with a neural network

Dear jlevy44,

Once again I want to thank you for being able to use your code.

I have used the code with both XGBoost and Random Forest. Now I would like to use it with a neural network. I know that I can't use TreeExplainer to compute the SHAP values, but I can use KernelExplainer. However, KernelExplainer has no attribute 'shap_interaction_values'. Do you have any suggestions on how to implement this, or otherwise make it possible to use the code with a neural network?

Thanks in advance.

Kind regards,
Jeroen
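
One model-agnostic way to get a rough pairwise interaction signal from a network is a second-order finite difference against a baseline row, sketched below. This is a swapped-in technique, not SHAP interaction values, and it is not part of InteractionTransformer. The names `pairwise_interaction` and `toy_predict` are hypothetical; `toy_predict` merely stands in for a trained network's predict function.

```python
import numpy as np

def pairwise_interaction(predict, x, baseline, i, j):
    """Second-order finite difference of `predict` for features i and j.

    Assumption: a single baseline row represents "feature absent".
    This is NOT a SHAP interaction value.
    """
    def with_features(idx):
        # Start from the baseline and switch on only the chosen features.
        z = baseline.copy()
        idx = np.array(idx, dtype=int)
        z[idx] = x[idx]
        return float(predict(z[None, :])[0])

    return (with_features((i, j)) - with_features((i,))
            - with_features((j,)) + with_features(()))

# Hypothetical stand-in for a neural network's predict function.
def toy_predict(X):
    return 2.0 * X[:, 0] + 3.0 * X[:, 1] * X[:, 2]

x = np.array([1.0, 2.0, 4.0])
baseline = np.zeros(3)
print(pairwise_interaction(toy_predict, x, baseline, 1, 2))  # 24.0 (multiplicative pair)
print(pairwise_interaction(toy_predict, x, baseline, 0, 1))  # 0.0 (purely additive pair)
```

Averaging this quantity over many rows and baselines would give a ranking of feature pairs, at a cost of four model calls per pair per row.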

Error in dimensions when using InteractionTransformer

Dear jlevy44,

I wanted to use the InteractionTransformer in combination with the XGBClassifier. Following your demo on GitHub, I run:

import numpy as np
from xgboost import XGBClassifier
transformer = InteractionTransformer(untrained_model=XGBClassifier(random_state=42, tree_method='hist'),
                                     max_train_test_samples=1000,
                                     mode_interaction_extract=int(np.sqrt(X_train.shape[1])))
transformer.fit(X_train, y_train)

Here X_train and y_train are DataFrames with shapes (700000, 39) and (700000, 1), respectively.

I get the following error:
---------------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals\managers.py in create_block_manager_from_blocks(blocks, axes)
1661 blocks = [
-> 1662 make_block(values=blocks[0], placement=slice(0, len(axes[0])))
1663 ]

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals\blocks.py in make_block(values, placement, klass, ndim, dtype)
2721
-> 2722 return klass(values, ndim=ndim, placement=placement)
2723

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals\blocks.py in __init__(self, values, placement, ndim)
129 if self._validate_ndim and self.ndim and len(self.mgr_locs) != len(self.values):
--> 130 raise ValueError(
131 f"Wrong number of items passed {len(self.values)}, "

ValueError: Wrong number of items passed 1, placement implies 39

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
1 from xgboost import XGBClassifier
2 transformer=InteractionTransformer(untrained_model=XGBClassifier(random_state=42),max_train_test_samples=1000,mode_interaction_extract=int(np.sqrt(X_train.shape[1]))) # mode_interaction_extract='sqrt'
----> 3 transformer.fit(X_train,y_train)

~\InteractionTransformer.py in fit(self, X, y)
204 # import pickle
205 # pickle.dump(shap_vals,open('shap_test.pkl','wb'))
--> 206 true_top_interactions=self.get_top_interactions(shap_vals)
207 #print(true_top_interactions)
208 self.design_terms='+'.join((np.core.defchararray.add(np.vectorize(lambda x: "Q('{}')*".format(x))(true_top_interactions.iloc[:,0]),np.vectorize(lambda x: "Q('{}')".format(x))(true_top_interactions.iloc[:,1]))).tolist())

~\InteractionTransformer.py in get_top_interactions(self, shap_vals)
223
224 """
--> 225 interaction_matrix=pd.DataFrame(shap_vals.mean(0),columns=self.features,index=self.features)#reduce(lambda x,y:x+y,shap_vals)/len(shap_vals)
226 interation_matrix_self_interact_removed=interaction_matrix.copy()
227 if not self.self_interactions:

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
495 mgr = init_dict({data.name: data}, index, columns, dtype=dtype)
496 else:
--> 497 mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
498
499 # For data is list-like, or Iterable (will consume into list)

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in init_ndarray(values, index, columns, dtype, copy)
232 block_values = [values]
233
--> 234 return create_block_manager_from_blocks(block_values, [columns, index])
235
236

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals\managers.py in create_block_manager_from_blocks(blocks, axes)
1670 blocks = [getattr(b, "values", b) for b in blocks]
1671 tot_items = sum(b.shape[0] for b in blocks)
-> 1672 raise construction_error(tot_items, blocks[0].shape[1:], axes, e)
1673
1674

ValueError: Shape of passed values is (39, 1), indices imply (39, 39)
---------------------------------------------------------------------------------

I then tried it with the data provided in your demo and everything worked fine. Do you know what could possibly go wrong?

Thanks in advance,

Hassan
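
One reading of the traceback: `shap_vals.mean(0)` holds 39 values (one per feature) instead of a 39 x 39 matrix, which would happen if `shap_vals` were not a 3-D array of shape (samples, features, features). Why that happens with this data is not established here. The sketch below only shows a shape check that would surface the problem earlier; `mean_interaction_matrix` is a hypothetical helper, not part of InteractionTransformer.

```python
import numpy as np

def mean_interaction_matrix(shap_vals, n_features):
    """Average interaction values over samples, checking the shape first.

    Hypothetical helper: expects shap_vals of shape
    (n_samples, n_features, n_features).
    """
    shap_vals = np.asarray(shap_vals)
    expected_tail = (n_features, n_features)
    if shap_vals.ndim != 3 or shap_vals.shape[1:] != expected_tail:
        raise ValueError(
            "expected shape (n_samples, {0}, {0}), got {1}".format(
                n_features, shap_vals.shape))
    return shap_vals.mean(axis=0)

# A well-formed array averages to a square matrix.
good = np.ones((5, 39, 39))
print(mean_interaction_matrix(good, 39).shape)

# A 2-D array (plain per-feature values) is rejected with a clear message.
bad = np.ones((5, 39))
try:
    mean_interaction_matrix(bad, 39)
except ValueError as err:
    print(err)
```

Printing `shap_vals.shape` just before line 225 of InteractionTransformer.py would confirm or rule out this reading.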

Endless run with RandomForest

Dear jlevy44,

I am very thankful for being able to use your code. However, I ran into a problem while trying to use the code for the Random Forest Regressor. If I run:
"
transformer=InteractionTransformer(RandomForestRegressor(random_state = 42),max_train_test_samples=100,mode_interaction_extract=10, cv_scoring='r2',num_workers=8,compute_interaction_dask=False,use_background_data=False)
transformer.fit(X_train,y_train)
"
Then the code finishes in 7 to 8 minutes, and I get the results that I want. But when I pass additional parameters to the Random Forest Regressor, for example:
"
transformer=InteractionTransformer(RandomForestRegressor(random_state = 42, n_estimators = 2000, max_features = 0.2, max_depth = 50, bootstrap = True),max_train_test_samples=100,mode_interaction_extract=10, cv_scoring='r2',num_workers=8,compute_interaction_dask=False,use_background_data=False)
transformer.fit(X_train,y_train)
"
Then the code keeps running and never finishes.
I really want to see the output for the Random Forest Regressor with all the specified parameters because this model has a much better fit on my data. Do you know how to solve the problem?

Thanks in advance.

Kind regards,
Jeroen

XGBoostRegressor

Hello,
I am trying to use your transformer with XGBoostRegressor, but I keep receiving the following error: Supported target types are: ('binary', 'multiclass'). Got 'continuous' instead.
Do you have any suggestions on how to solve this problem?
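
This message matches the one scikit-learn's `StratifiedKFold` raises when it is given a continuous target. Whether InteractionTransformer uses stratified splitting internally is an assumption here, not something the report confirms. The sketch below only reproduces the message with scikit-learn alone and shows that plain `KFold` accepts the same target; the data is synthetic.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y_continuous = rng.normal(size=20)  # regression-style target

# Stratified splitting needs class labels, so a continuous target fails.
try:
    next(StratifiedKFold(n_splits=5).split(X, y_continuous))
    stratified_error = None
except ValueError as err:
    stratified_error = str(err)
print(stratified_error)

# Unstratified splitting does not inspect the target type.
folds = list(KFold(n_splits=5, shuffle=True, random_state=0).split(X, y_continuous))
print(len(folds))
```

If the library exposes a way to choose the splitter or a regression scoring option, that would be the place to look; another report in this tracker passes cv_scoring='r2' with a regressor.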

Empty data frame encountered during demo run with epistasis.test.csv

Hi,

When running the demo file in Python, I encountered an error at the line "transformer.fit(X_train,y_train)". I uncommented lines 204, 205, and 208 here. The array in shap_test.pkl contains only NaN. The Python output and the error message are below:

Shap Interaction Size: (240, 56, 56)
Empty DataFrame
Columns: [feature_1, feature_2, shap_interaction_score]
Index: []
<pandas.core.indexing._iLocIndexer object at 0x2abb2b457770>
Traceback (most recent call last):
File "", line 1, in
File "/scratch/ducryf/int/interactiontransformer/InteractionTransformer.py", line 209, in fit
self.design_terms='+'.join((np.core.defchararray.add(np.vectorize(lambda x: "Q('{}')*".format(x))(true_top_interactions.iloc[:,0]),np.vectorize(lambda x: "Q('{}')".format(x))(true_top_interactions.iloc[:,1]))).tolist())
File "/usr/scratch/blauen/ducryf/miniconda3/lib/python3.7/site-packages/numpy/lib/function_base.py", line 2091, in __call__
return self._vectorize_call(func=func, args=vargs)
File "/usr/scratch/blauen/ducryf/miniconda3/lib/python3.7/site-packages/numpy/lib/function_base.py", line 2161, in _vectorize_call
ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
File "/usr/scratch/blauen/ducryf/miniconda3/lib/python3.7/site-packages/numpy/lib/function_base.py", line 2117, in _get_ufunc_and_otypes
raise ValueError('cannot call vectorize on size 0 inputs '
ValueError: cannot call vectorize on size 0 inputs unless otypes is set

I get the same error on two systems. One uses WSL (Windows Subsystem for Linux) with Ubuntu 20.04, Anaconda 4.9.2, and Python 3.8.5. The other runs CentOS 7.4.1708 with Miniconda 4.9.2 and Python 3.7.4.

How can I solve this issue?

Cheers,
Fabian
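
The final ValueError is what `np.vectorize` raises on an empty input when `otypes` is not set, so it appears to be a symptom of the empty interaction table rather than the root cause; the all-NaN array is the more likely place to look. The sketch below reproduces the NumPy error and shows two guards. `all_nan` and `build_design_terms` are hypothetical helpers, not part of InteractionTransformer, and the term format is copied from the traceback.

```python
import numpy as np

quote = np.vectorize(lambda name: "Q('{}')".format(name))

# np.vectorize cannot infer an output type from zero elements.
try:
    quote(np.array([]))
    empty_error = None
except ValueError as err:
    empty_error = str(err)
print(empty_error)

def all_nan(shap_vals):
    """True when every interaction value is NaN (hypothetical check)."""
    return bool(np.isnan(np.asarray(shap_vals, dtype=float)).all())

def build_design_terms(first, second):
    """Join feature pairs as Q('a')*Q('b') terms; empty input gives ''."""
    return "+".join(
        "Q('{}')*Q('{}')".format(a, b) for a, b in zip(first, second))

print(all_nan(np.full((2, 3, 3), np.nan)))
print(build_design_terms(["f1", "f2"], ["f3", "f4"]))
print(repr(build_design_terms([], [])))
```

Checking `all_nan(shap_vals)` right after the SHAP step would turn the late, indirect error into an early, specific one.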
