Credit_Risk_Analysis

You can find the analysis notebooks here: credit_risk_resampling.ipynb | credit_risk_ensemble.ipynb

Analysis Overview

In this project we build and evaluate several machine learning models in Python to predict credit risk. The following techniques were used:

  • Oversample the data using the RandomOverSampler and SMOTE algorithms.
  • Undersample the data using the ClusterCentroids algorithm.
  • Combine over- and undersampling using the SMOTEENN algorithm.
  • Train two ensemble models designed to reduce bias from class imbalance: BalancedRandomForestClassifier and EasyEnsembleClassifier.
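To make the oversampling idea concrete, here is a minimal standard-library sketch of random oversampling: minority-class rows are duplicated at random until every class is as large as the biggest one. This only illustrates the concept; it is not the RandomOverSampler implementation used in the notebooks, and the function name `random_oversample` and the toy data are assumptions made up for this example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all
    classes have as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: 8 low-risk loans and 2 high-risk loans.
rows = [[i] for i in range(10)]
labels = ["low_risk"] * 8 + ["high_risk"] * 2

X_res, y_res = random_oversample(rows, labels)
balanced_counts = Counter(y_res)
print(balanced_counts)
```

After resampling, both classes contain the same number of rows, so a classifier trained on the result no longer sees the low-risk class far more often than the high-risk class.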

We then assess each model's performance and recommend whether it should be used to predict credit risk.

Results (Balanced Accuracy Scores, Confusion Matrices and Imbalanced Classification Reports)

RandomOverSampler model


The balanced accuracy score is 64%. High-risk precision is only about 1% with a sensitivity (recall) of 62%, giving an F1 of about 2%. Because low-risk loans vastly outnumber high-risk ones, low-risk precision is almost 100%, with a sensitivity of 68%.
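The F1 figure follows directly from the precision and sensitivity quoted for the high-risk class, since F1 is the harmonic mean of the two. A small sketch, using the rounded values from the text (the helper name `f1_score_from` is an assumption for this example):

```python
def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# High-risk class of the RandomOverSampler model: precision 1%, recall 62%.
high_risk_f1 = f1_score_from(0.01, 0.62)
print(f"{high_risk_f1:.3f}")
```

Because the harmonic mean is dominated by the smaller value, a precision near 1% keeps F1 near 2% no matter how high the sensitivity is.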

SMOTE model


The results are very similar to those of the previous model. The balanced accuracy score is 63%. High-risk precision is only about 1% with a sensitivity of 60%, giving an F1 of about 2%. Because low-risk loans vastly outnumber high-risk ones, low-risk precision is almost 100%, with a sensitivity of 68%.

ClusterCentroids model


The balanced accuracy score drops to about 51%. High-risk precision is still 1% with a sensitivity of 60%, giving an F1 of 1%. Low-risk sensitivity is only 43% because of the large number of false positives (low-risk loans flagged as high-risk).

SMOTEENN model


The balanced accuracy score is around 62%. High-risk precision is still 1% with a sensitivity of 70%, giving an F1 of only 2%. Low-risk sensitivity is 55% because of the large number of false positives.

BalancedRandomForestClassifier model


The balanced accuracy score rises to around 79%. High-risk precision is still poor at just 4%, with a sensitivity of only 67%, giving an F1 of only 7%. Low-risk sensitivity is now 91% with 100% precision, thanks to fewer false positives.

EasyEnsembleClassifier model


The balanced accuracy score rises to over 92%. High-risk precision is still poor at just 7%, with a sensitivity of 91%, giving an F1 of 14%. Low-risk sensitivity is now 94% with 100% precision, thanks to fewer false positives.
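For a two-class problem, balanced accuracy is the mean of the per-class sensitivities, so the scores reported above can be cross-checked against the quoted high-risk and low-risk sensitivities. The sketch below does that with the rounded figures from this section; the dictionary layout and the helper name `balanced_accuracy` are assumptions for this example.

```python
def balanced_accuracy(recalls):
    """Mean of the per-class recall (sensitivity) values."""
    return sum(recalls) / len(recalls)


# (high-risk recall, low-risk recall) as quoted in the text, rounded.
reported_recall = {
    "RandomOverSampler": (0.62, 0.68),
    "SMOTE": (0.60, 0.68),
    "ClusterCentroids": (0.60, 0.43),
    "SMOTEENN": (0.70, 0.55),
    "BalancedRandomForestClassifier": (0.67, 0.91),
    "EasyEnsembleClassifier": (0.91, 0.94),
}

scores = {name: balanced_accuracy(r) for name, r in reported_recall.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

The recomputed values agree with the reported scores to within about one percentage point. The small gaps for RandomOverSampler and SMOTE are presumably due to the sensitivities being rounded in the text.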

Summary

All of the models have low precision when identifying high-risk credit. The ensemble models brought significant improvements, particularly in the sensitivity for high-risk loans. With a recall above 90%, the EasyEnsembleClassifier model detects nearly all high-risk credit. On the other hand, because of the poor precision, many low-risk loans are still misclassified as high-risk, which would undermine the bank's credit strategy and cause it to miss out on income. As a result, I would advise the bank against using any of these models to predict credit risk.
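One way to read the precision figures: for every correctly flagged high-risk loan, a precision p implies (1 - p) / p low-risk loans flagged by mistake. The sketch below applies this to the high-risk precision quoted for the two ensemble models; the helper name `false_alarms_per_hit` is an assumption for this example.

```python
def false_alarms_per_hit(precision):
    """Number of false positives per true positive implied by a precision."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    return (1 - precision) / precision


# High-risk precision quoted in the results: 7% and 4%.
eec_ratio = false_alarms_per_hit(0.07)  # EasyEnsembleClassifier
brf_ratio = false_alarms_per_hit(0.04)  # BalancedRandomForestClassifier
print(f"EasyEnsembleClassifier: {eec_ratio:.1f} false alarms per true high-risk loan")
print(f"BalancedRandomForestClassifier: {brf_ratio:.1f} false alarms per true high-risk loan")
```

Even for the best model, more than a dozen low-risk applicants would be flagged for each genuine high-risk one, which is the cost behind the recommendation above.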
