Comments (6)
Thank you for your response, @guillermo-navas-palencia. After running the analysis, I found the properties of binning_table, but I still don't understand them. I'm not very good at math, so I don't know the difference between the normalized Jensen-Shannon divergence reported as `js` and the Jeffrey divergence reported as `iv`.
- Can you explain it, or suggest a tutorial or documentation that explains this difference?
- The main problem is that I need `iv` to choose the important fields in the dataset, but I cannot calculate it, so I cannot rank the fields for my experiment. This raises the question: can `js` replace `iv` in my problem, and how can I prove it?
from optbinning.
I use the Jensen-Shannon divergence for both binary and multiclass targets. See https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence.
Yes, you can rank by JS. Both IV and JS are divergence measures.
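To make "both are divergence measures" concrete, here is a minimal sketch that computes the Jeffrey divergence (IV) and the Jensen-Shannon divergence from per-bin counts of a binary target and ranks features by each. It uses the textbook definitions with natural logarithms, not necessarily the exact code in optbinning. The feature names and counts are hypothetical, and the sketch assumes every bin contains at least one event and one non-event.

```python
import math

def _distributions(events, non_events):
    """Turn per-bin counts into the two per-bin distributions being compared."""
    total_ev, total_nev = sum(events), sum(non_events)
    p = [e / total_ev for e in events]
    q = [n / total_nev for n in non_events]
    return p, q

def jeffrey_iv(events, non_events):
    """Jeffrey divergence (IV): sum over bins of (q - p) * ln(q / p)."""
    p, q = _distributions(events, non_events)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

def jensen_shannon(events, non_events):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint m."""
    p, q = _distributions(events, non_events)
    total = 0.0
    for pi, qi in zip(p, q):
        mi = 0.5 * (pi + qi)
        total += 0.5 * (pi * math.log(pi / mi) + qi * math.log(qi / mi))
    return total

# Hypothetical features: (events per bin, non-events per bin).
features = {
    "age": ([10, 30, 60], [50, 30, 20]),
    "income": ([30, 35, 35], [35, 35, 30]),
    "tenure": ([20, 30, 50], [40, 30, 30]),
}

rank_iv = sorted(features, key=lambda f: jeffrey_iv(*features[f]), reverse=True)
rank_js = sorted(features, key=lambda f: jensen_shannon(*features[f]), reverse=True)
print("ranked by IV:", rank_iv)
print("ranked by JS:", rank_js)
```

On this toy data the two rankings coincide. That is not guaranteed for every dataset, since neither measure is an exact function of the other, so it is worth comparing both rankings on a binary problem before relying on JS alone.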
Hi @juyjuyy.
Note that the binning quality score for a multiclass target is slightly different: https://github.com/guillermo-navas-palencia/optbinning/blob/master/optbinning/binning/metrics.py#L347. It replaces the IV with the normalized Jensen-Shannon divergence. The `js` property can be retrieved from the multiclass binning table: https://github.com/guillermo-navas-palencia/optbinning/blob/master/optbinning/binning/metrics.py#L347.
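For readers who want to see what a normalized Jensen-Shannon divergence looks like for more than two classes, the following is a rough sketch. It assumes the textbook generalized JS divergence with equal class weights (entropy of the average distribution minus the average entropy), divided by ln(number of classes) so the result lies in [0, 1]. It is an illustration, not a copy of the code in metrics.py, and the counts are made up.

```python
import math

def entropy(dist):
    """Shannon entropy in nats; empty bins contribute nothing."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def normalized_js(class_counts):
    """class_counts[k][b] is the number of records of class k in bin b.

    Assumption: equal weight per class, normalization by ln(n_classes).
    """
    # One distribution over bins per class.
    dists = [[c / sum(row) for c in row] for row in class_counts]
    n = len(dists)
    # Average distribution over bins across classes.
    mixture = [sum(col) / n for col in zip(*dists)]
    js = entropy(mixture) - sum(entropy(d) for d in dists) / n
    return js / math.log(n)

# Hypothetical three-class counts over three bins.
separated = [[10, 0, 0], [0, 10, 0], [0, 0, 10]]   # each class in its own bin
identical = [[5, 5, 5], [10, 10, 10], [1, 1, 1]]   # same shape for every class
partial = [[8, 1, 1], [2, 6, 2], [1, 2, 7]]        # some separation

for name, counts in [("separated", separated), ("identical", identical), ("partial", partial)]:
    print(name, round(normalized_js(counts), 4))
```

A value near 1 means the bins separate the classes almost perfectly, and a value near 0 means every class has the same distribution over the bins.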
Can you suggest some documents related to the problem I have? @guillermo-navas-palencia
Thank you for your response, @guillermo-navas-palencia.
- I wonder why both `IV` and `JS` values exist in the binning table for binary classification, but only `JS` exists for multi-class classification. Could you explain the theory behind that? Is `IV` not used for multi-class problems?
- For IV, I can use the rule of thumb that a feature with IV above 0.1 is a strong predictor. If JS is the value used for ranking, how do I know which threshold to use for feature selection?
Please help me, I really need your reply.
- The IV is a divergence measure suitable only for a binary target. The JS divergence generalizes the IV by allowing multiple categories (multi-class problems).
- The IV, unlike the JS divergence, is unbounded. I have found experimentally that the IV is commonly 5-10 times larger than the JS for a binary target. Therefore, a JS value above 0.02 might work as a threshold, although I suppose that depends on the number of classes. This is a problem I haven't investigated.
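The boundedness claim and the rough size of the IV-to-JS ratio can be checked with a small experiment. The sketch below uses the textbook definitions with natural logarithms on a hypothetical two-bin binary target, where `eps` controls how strongly the bins separate the classes. The un-normalized JS between two distributions never exceeds ln 2, while the IV keeps growing as a bin becomes purer.

```python
import math

def iv(p, q):
    """Jeffrey divergence between two distributions over the same bins."""
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

def js(p, q):
    """Jensen-Shannon divergence (natural log, upper bound ln 2)."""
    total = 0.0
    for pi, qi in zip(p, q):
        mi = 0.5 * (pi + qi)
        total += 0.5 * (pi * math.log(pi / mi) + qi * math.log(qi / mi))
    return total

# Smaller eps means purer bins, i.e. stronger separation of the two classes.
for eps in (0.4, 0.1, 1e-3, 1e-6):
    p, q = [eps, 1 - eps], [1 - eps, eps]
    ratio = iv(p, q) / js(p, q)
    print(f"eps={eps:g}  IV={iv(p, q):.4f}  JS={js(p, q):.4f}  IV/JS={ratio:.1f}")
```

For weakly separated distributions, a second-order expansion of both formulas gives IV ≈ 8 × JS, which agrees with the 5-10 range reported above. The ratio grows once bins become nearly pure, so any JS threshold derived from an IV rule of thumb should be treated as a starting point only.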