Comments (7)
Hi @guillermo-navas-palencia,
The splits returned by OptimalBinning with its default options are [[ 2., 7., 9., 3., 10., 4. 8],[-1]]. When I pass these splits to the user_splits parameter, it returns the same error :).
from optbinning.
I created this example to reproduce your problem:
import numpy as np
from optbinning import OptimalBinning
from pytest import approx

np.random.seed(0)
n = 100000

# n records for each of the eight category values
x = sum([[i] * n for i in [-1, 2, 3, 4, 7, 8, 9, 10]], [])

# Binary target: one segment of n values per category, in the same order as x
y = list(np.random.binomial(1, 0.011665, n))
y += list(np.zeros(n))
y += list(np.random.binomial(1, 0.0133333, n))
y += list(np.random.binomial(1, 0.166667, n))
y += list(np.zeros(n))
y += list(np.random.binomial(1, 0.0246041, n))
y += list(np.zeros(n))
y += list(np.random.binomial(1, 0.025641, n))

# Category groups passed as user-defined splits
user_splits = [[2., 7., 9., 3., 10., 4.], [8], [-1]]
user_splits_fixed = [True, True, True]

optb1 = OptimalBinning(dtype="categorical", user_splits=user_splits)
optb2 = OptimalBinning(dtype="categorical", user_splits=user_splits,
                       user_splits_fixed=user_splits_fixed)

# Both configurations should yield the same information value (IV)
for optb in (optb1, optb2):
    optb.fit(x, y)
    optb.binning_table.build()
    assert optb.binning_table.iv == approx(0.09345086993827473, rel=1e-6)
After commit a6d015b, it should work as expected.
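For anyone wondering what the IV assertion measures, here is a minimal sketch of the information value of a binary target for a fixed grouping of categories, using the standard textbook definition. The function name information_value is hypothetical and not part of OptBinning, and the handling of pure bins (skipped here) is an assumption that may differ from what the library does.

```python
import numpy as np


def information_value(x, y, groups):
    """IV of binary target y for a fixed grouping of the categories in x.

    Hypothetical helper for illustration; not an OptBinning function.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    total_event = y.sum()
    total_nonevent = len(y) - total_event

    iv = 0.0
    for group in groups:
        mask = np.isin(x, group)          # records whose category is in this group
        event = y[mask].sum()
        nonevent = mask.sum() - event
        if event == 0 or nonevent == 0:
            continue                      # pure bin: WoE undefined, skipped in this sketch
        p_event = event / total_event
        p_nonevent = nonevent / total_nonevent
        # Bin contribution: (share of non-events - share of events) * WoE
        iv += (p_nonevent - p_event) * np.log(p_nonevent / p_event)
    return iv


# Tiny deterministic example: two categories, four records each
x_small = [0] * 4 + [1] * 4
y_small = [1, 0, 0, 0, 1, 1, 1, 0]
iv_small = information_value(x_small, y_small, [[0], [1]])
```

Merging both categories into a single group gives an IV of 0, since one bin cannot separate events from non-events.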
Hi @nic9lif3,
Thanks, I will try to reproduce your problem.
Hi @nic9lif3,
There was a bug, I just fixed it, thanks for noticing. I acknowledge some tests are missing for the categorical type. I will commit the changes to master with new unit tests in a few minutes.
Thanks @guillermo-navas-palencia for your contribution.
Hi @guillermo-navas-palencia,
It seems you changed the code that sets user splits for categorical variables. An "unhashable type: 'list'" error is raised when I pass user_splits as a list of lists.
Hi @nic9lif3,
No, it has not changed. I just ran the previous example with version 0.8.0, and it worked without issues. It would be more helpful if you could attach the full error message; my impression is that this error is not related to OptBinning.
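As background on the error itself, here is a minimal sketch of how Python raises "unhashable type" for lists in general. This is plain Python and does not claim to reproduce the failure reported above; the helper name try_hash is hypothetical.

```python
def try_hash(value):
    """Return True if value can be used as a dict key or set element."""
    try:
        hash(value)
        return True
    except TypeError:
        return False


groups = [[2.0, 7.0], [8.0], [-1.0]]

# Lists are mutable, so Python refuses to hash them...
list_hashable = [try_hash(g) for g in groups]
# ...while tuples holding the same values are hashable.
tuple_hashable = [try_hash(tuple(g)) for g in groups]

# Any operation that hashes its elements (set, dict key, and so on) fails on lists.
try:
    set(groups)
    message = ""
except TypeError as exc:
    message = str(exc)

print(list_hashable, tuple_hashable, message)
```

A full traceback shows which call tried to hash the list, which is why attaching it helps narrow down where the problem comes from.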