Giter Club home page Giter Club logo

woe's Introduction

WOE

WOE Transformation常用于信用风险评分卡(Credit Risk Scorecard)模型中,采用分箱的方式对原始特征进行非线性映射。常见的分箱方法有等宽分箱、等频分箱、最优分箱等,这里设计了一种基于决策树的分箱算法,其核心是基于iv值最大求最优分箱,可同时处理连续型变量和离散型变量。

1、连续型变量:针对一对feature-label构造决策树,选择最优分裂点时需保证左树iv+右树iv之和最大,如果二者之和大于不分裂时的iv则分裂,否则不分裂;同时需要保证每个叶子节点样本数量大于给定的最小样本数量。最终,每个父节点存储了用于分箱的分裂点信息,叶子节点存储了该分箱内的woe、iv、正负样本数量等信息;

2、离散型变量:对特征的每个离散值求woe值,用经woe值替换后的样本构造决策树,方法与处理连续型变量一致。需要注意的是在树的每一次分裂过程中,都要记录下分裂所涉及到的原始特征值。最终,每个叶子节点存储了该分箱内的原始特征值、woe、iv、正负样本数量等信息;

3、提取树结构中存储的的分裂点信息、分箱内的原始特征值、woe、iv、正负样本数量信息构成分箱规则。最终生成的分箱规则中,bin_value_list表示离散特征每个分箱对应的原始特征值;split_left表示连续特征分箱左界(>),split_right表示连续特征分箱右界(<=);iv_sum表示该特征所有分箱iv之和。

针对UCI信用卡用户违约和支付数据集credit card,对比了model builder和采用本方法得到的分箱结果,表明基于决策树的最优分箱效果超过了model builder:分箱数量合理、箱内样本数量均匀、iv值比model builder跑出来的要大。

woe's People

Contributors

zhaoxingfeng avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.