credit-risk-classification

Overview of the Analysis

The purpose of this analysis is to classify whether a loan is "high-risk" or "healthy."

The features used in the model were the loan size, the interest rate, the borrower's income, their debt-to-income ratio, their number of accounts, their number of derogatory marks, and their total debt.

We were trying to predict the status of the loan, which would be either 0 or 1. A 0 indicated that the loan was healthy and a 1 indicated that the loan was at high risk of defaulting.

A logistic regression model was used for the classification. Before the data was passed to the model, we first created our X and y variables. Our y variable was the loan_status column. Our X variable was every other column.
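The separation of features and target described above can be sketched as follows. This is a minimal illustration, assuming the data is held in a pandas DataFrame; the rows are made up, and the column names other than loan_status are assumed rather than taken from the original notebook.

```python
import pandas as pd

# Hypothetical rows for illustration only. The column names other than
# loan_status are assumed, and the values are invented.
loans = pd.DataFrame(
    {
        "loan_size": [10700.0, 8400.0, 19100.0, 9000.0],
        "interest_rate": [7.672, 6.692, 11.2, 6.963],
        "total_debt": [22800, 13600, 55400, 16100],
        "loan_status": [0, 0, 1, 0],
    }
)

# Target: the loan_status column (0 = healthy, 1 = high-risk).
y = loans["loan_status"]

# Features: every column except loan_status.
X = loans.drop(columns="loan_status")
```

Dropping the target by name, rather than selecting feature columns one by one, keeps X correct if more feature columns are added later.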

Afterwards, we split the data into training and validation sets so that the model's accuracy could be tested on data it had not seen during fitting. This was done using train_test_split with the following code:

X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=1)

We then fit the model with the training data. Next, we used that model to predict labels for the X_valid data and measured its accuracy against the y_valid data.
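The split, fit, and predict steps can be sketched end to end as below. This is not the original notebook code: the data is a synthetic stand-in generated with make_classification, and the LogisticRegression arguments (random_state, max_iter) are assumptions chosen for reproducibility and convergence. Only the train_test_split call matches the line quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the loan data (assumption): 7 numeric features
# and an imbalanced binary target, roughly 90% label 0 and 10% label 1.
X, y = make_classification(
    n_samples=1000,
    n_features=7,
    n_informative=4,
    weights=[0.9, 0.1],
    random_state=1,
)

# Same call as in the text; the default test_size holds out 25% of the rows.
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=1)

# Fit on the training rows only (constructor arguments are assumed).
model = LogisticRegression(random_state=1, max_iter=1000)
model.fit(X_train, y_train)

# Predict labels for the held-out rows and compare them with y_valid.
predictions = model.predict(X_valid)
accuracy = model.score(X_valid, y_valid)
```

Because the model never sees X_valid during fitting, the accuracy computed here estimates how it would behave on new loans rather than how well it memorized the training rows.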

Results

  • Precision:

    • This measures how often the model is correct when it predicts a given label.
      • On label 0, the precision was 1.00, meaning that when the model predicts that a loan is "healthy" it is correct 100% of the time.
      • On label 1, the precision was 0.85, meaning that when the model predicts that a loan is "high-risk" it is correct only 85% of the time.
  • Recall:

    • This measures the share of the actual instances of a class that the model correctly identifies.
      • On label 0, the recall was 0.99, which means that the model correctly identified 99% of the "healthy" loans.
      • On label 1, the recall was 0.91, which means that the model correctly identified 91% of the "high-risk" loans.
  • Accuracy:

    • The accuracy score of the model was 0.99, which means that the model's predictions were correct 99% of the time.
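The three metrics above can be computed directly from true and predicted labels. The sketch below uses plain Python and a made-up set of ten labels, not the project's data, so its numbers are unrelated to the results reported here. The helper names precision_recall and accuracy are hypothetical.

```python
def precision_recall(y_true, y_pred, label):
    """Return (precision, recall) for one label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels for illustration only (0 = healthy, 1 = high-risk).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

precision_1, recall_1 = precision_recall(y_true, y_pred, label=1)
overall_accuracy = accuracy(y_true, y_pred)
```

In this toy example the model predicts "high-risk" four times and is right three times, so precision for label 1 is 0.75. It finds three of the four actual "high-risk" loans, so recall is also 0.75. Eight of the ten predictions are correct, so accuracy is 0.8.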

Summary

This machine learning model performs very well at predicting whether a loan is "healthy" or "high-risk". The model shows high precision and recall for both labels: it identifies "healthy" loans with 100% precision and 99% recall, and it identifies "high-risk" loans with 85% precision and 91% recall. This means that when the model predicts a loan as "healthy", it is almost always correct, and it also catches a high proportion of the "high-risk" loans.

The model's accuracy is 99%, which means it makes a correct prediction for almost all instances. The model was also trained and evaluated on a large dataset. The evaluation was done on over 19,000 data points.

I would recommend using this LogisticRegression model for loan predictions. It has high precision, recall, and accuracy, and it was evaluated on a large dataset.
