jianninapinto / loan-default-risk-prediction

A classifier trained on labeled data, with oversampling and undersampling techniques, to predict whether a borrower will default on a loan. The model is intended as a reference tool that helps investors make informed lending decisions based on a borrower's ability to repay, with the aim of lowering risk and maximizing profit.

Topics: classification-report, confusion-matrix, logistic-regression, one-hot-encoding, python, random-forest-classifier, roc-auc-curve, scikit-learn, smote, supervised-machine-learning, xgboost, partial-dependence-plot, shapley

Loan Default Risk Prediction

Overview

This repository contains a loan default risk prediction project aimed at assisting investors on the Lending Club platform. By using machine learning algorithms, the model predicts the probability of borrowers defaulting on their loans, enabling investors to make more informed decisions.

Project Statement

The main goal of this project is to develop a predictive model that can analyze borrower profiles and loan characteristics to assess the risk of loan default. The model's performance will be evaluated using various metrics to ensure its effectiveness.

This machine learning model is intended to be used as a reference tool to help investors make informed decisions about lending to potential borrowers based on their ability to repay. The main purpose is to lower risk and maximize profit.

Data Collection

The dataset used in this project contains more than 9,500 loans, with information about the borrower profile, the loan structure, and whether the loan was repaid. The data comes from the Kaggle dataset "Loan Data".

Data dictionary

#   Variable           Explanation
0   credit_policy      1 if the customer meets the credit underwriting criteria; 0 otherwise.
1   purpose            The purpose of the loan.
2   int_rate           The interest rate of the loan (riskier borrowers are assigned higher interest rates).
3   installment        The monthly installments owed by the borrower if the loan is funded.
4   log_annual_inc     The natural log of the self-reported annual income of the borrower.
5   dti                The debt-to-income ratio of the borrower (amount of debt divided by annual income).
6   fico               The FICO credit score of the borrower.
7   days_with_cr_line  The number of days the borrower has had a credit line.
8   revol_bal          The borrower's revolving balance (amount unpaid at the end of the credit card billing cycle).
9   revol_util         The borrower's revolving line utilization rate (the amount of the credit line used relative to total credit available).
10  inq_last_6mths     The borrower's number of inquiries by creditors in the last 6 months.
11  delinq_2yrs        The number of times the borrower was 30+ days past due on a payment in the past 2 years.
12  pub_rec            The borrower's number of derogatory public records.
13  not_fully_paid     1 if the loan is not fully paid; 0 otherwise.
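
As a rough illustration of how columns like these might be prepared for modelling, the sketch below one-hot encodes `purpose` and separates the `not_fully_paid` target. The three rows and the `purpose` category names are made up for the example; they are not records from the dataset, and the notebook's actual preprocessing may differ.

```python
import math

import pandas as pd

# Made-up rows for illustration only; not taken from the Kaggle data.
loans = pd.DataFrame(
    {
        "credit_policy": [1, 1, 0],
        "purpose": ["debt_consolidation", "credit_card", "small_business"],
        "int_rate": [0.1189, 0.1071, 0.1496],
        "log_annual_inc": [11.35, 11.08, 10.37],
        "fico": [737, 707, 667],
        "not_fully_paid": [0, 0, 1],
    }
)

# log_annual_inc is a natural log, so exp() recovers the reported income.
annual_inc = loans["log_annual_inc"].apply(math.exp)

# One-hot encode the only categorical column; numeric columns pass through.
encoded = pd.get_dummies(loans, columns=["purpose"], dtype=int)

# Separate the features from the target.
X = encoded.drop(columns=["not_fully_paid"])
y = encoded["not_fully_paid"]

print(X.columns.tolist())
```

Each distinct `purpose` value becomes its own 0/1 column, which lets models that expect numeric input use the loan purpose.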

Machine Learning Models

Three different machine learning models were trained and compared: Logistic Regression, Random Forest, and XGBoost. Each model's performance was evaluated to determine the most suitable approach for loan default risk prediction.
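
A comparison of this kind could look roughly like the sketch below. It runs on synthetic data rather than the loan dataset, the class balance is an assumption, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only scikit-learn. Resampling is left out for brevity, so this is not the notebook's actual pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the loan data (the 84/16 split is assumed).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.84, 0.16], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Stand-in for XGBoost; swap in xgboost.XGBClassifier if it is installed.
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit every model on the same split and score it on the held-out data.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: F1 = {score:.3f}")
```

Using one stratified split for all three models keeps the comparison fair, since each model sees the same proportion of defaults in training and testing.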

Model Evaluation and Performance

The models were compared on evaluation metrics such as accuracy and F1-score, and the best-performing model was selected for further analysis.
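
To show why F1-score is worth reporting alongside accuracy, the sketch below computes both from confusion-matrix counts. The counts are hypothetical (1,000 loans with 160 defaults) and `classification_metrics` is a helper written for this example, not a function from the notebook.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 for the positive (default) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical model A: predicts "repaid" for every loan.
always_repaid = classification_metrics(tp=0, fp=0, fn=160, tn=840)

# Hypothetical model B: catches half of the defaults, with some false alarms.
catches_half = classification_metrics(tp=80, fp=120, fn=80, tn=720)

for label, m in [("always repaid", always_repaid), ("catches half", catches_half)]:
    print(f"{label}: accuracy = {m['accuracy']:.2f}, F1 = {m['f1']:.2f}")
```

Model A has the higher accuracy (0.84 against 0.80) but an F1-score of 0 because it never flags a default, while model B reaches an F1-score of about 0.44.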

Hyperparameter Tuning

To optimize the selected model's performance, its hyperparameters were tuned using techniques such as randomized search with cross-validation.
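
A minimal sketch of randomized search with cross-validation, using scikit-learn's `RandomizedSearchCV`, is shown below. The README does not say which model was selected or which ranges were searched, so the random forest, the parameter ranges, and the synthetic data are all assumptions made for illustration.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic, imbalanced stand-in for the loan data (the 84/16 split is assumed).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.84, 0.16], random_state=0
)

# Hypothetical search space; each trial samples one value per parameter.
param_distributions = {
    "n_estimators": randint(50, 200),
    "max_depth": randint(2, 10),
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,          # number of sampled parameter combinations
    scoring="f1",      # optimize F1 on the positive class
    cv=3,              # 3-fold cross-validation per combination
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(f"best cross-validated F1: {search.best_score_:.3f}")
```

Randomized search samples a fixed number of combinations instead of trying every one, which keeps the cost predictable when the search space is large.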

Results and Insights

The final model achieved a decent F1-score in identifying potential loan defaulters, but there is still room for improvement. Feature importance analysis and partial dependence plots provided insight into how the model makes its predictions. Predicting loan defaults is a challenging task, and achieving high accuracy for both classes simultaneously can be difficult.

Authors

jianninapinto
