
Coding Logistic Regression From Scratch - Lab

Introduction

In this lab, you'll practice your ability to translate mathematical algorithms into Python functions. This will deepen and solidify your understanding of logistic regression!

Objectives

In this lab you will:

  • Build a logistic regression model from scratch using gradient descent

Overview

Recall that the logistic regression algorithm builds upon the intuition from linear regression. In logistic regression, you start by taking the input data, X, and multiplying it by a vector of weights, one for each individual feature, which produces an output, $\hat{y}$. Afterward, you'll use an iterative approach, gradient descent, to tune these weights.

Linear regression setup

Write a simple function predict_y() that takes in a matrix X of observations and a vector of feature weights w and outputs a vector of predictions for the various observations.

Recall that this output is the sum of the products of each feature observation and its corresponding feature weight:

$\large \hat{y}_i = X_{i1} \cdot w_1 + X_{i2} \cdot w_2 + X_{i3} \cdot w_3 + ... + X_{in} \cdot w_n$

Hint: Think about which mathematical operation you've seen previously that will take a matrix (X) and multiply it by a vector of weights (w). Use NumPy!

# Your code here
import numpy as np

def predict_y(X, w): 
    pass
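
A possible solution is sketched below: each prediction is the dot product of a row of X with w, so the whole vector of predictions is the matrix-vector product of X and w, which NumPy computes with np.dot() (or the @ operator).

# A minimal sketch using the matrix-vector product
def predict_y(X, w):
    return np.dot(X, w)  # equivalently: X @ w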

The sigmoid function

Recall that the sigmoid function is used to map the linear regression model output to a range of 0 to 1, satisfying basic premises of probability. As a reminder, the sigmoid function is defined by:

$S(x) = \dfrac{1}{1+e^{-x}}$

Write this as a Python function where x is the input and the function outputs the result of the sigmoid function.

Hint: Use NumPy!

# Your code here
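
A minimal sketch of one way to write it, assuming NumPy is already imported as np from the cell above; np.exp() broadcasts, so the function works on both scalars and arrays.

def sigmoid(x):
    # np.exp broadcasts, so x can be a scalar or a NumPy array
    return 1 / (1 + np.exp(-x))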

Plot the sigmoid

For good measure, let's do a brief investigation of your new function. Plot the output of your sigmoid() function using 10,000 values evenly spaced from -20 to 20.

import matplotlib.pyplot as plt
%matplotlib inline

# Plot sigmoid
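
One way to produce the plot, assuming the sigmoid() function from the previous step is defined:

import numpy as np

x = np.linspace(-20, 20, 10000)  # 10,000 evenly spaced values
plt.plot(x, sigmoid(x))
plt.title('Sigmoid function')
plt.xlabel('x')
plt.ylabel('S(x)')
plt.show()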

Gradient descent with the sigmoid function

Recall that gradient descent is a numerical method for finding a minimum of a cost function. In the case of logistic regression, you are looking to minimize the error between the model's predictions and the actual data labels. To do this, you first calculate an error vector based on the current model's feature weights. You then multiply the transpose of the training matrix by this error vector to obtain the gradient. Finally, you multiply the gradient by the step size and add it to the current weight vector to update it. Below, write such a function. It will take 5 inputs:

  • X
  • y
  • max_iterations
  • alpha (the step size)
  • initial_weights

By default, have your function set the initial_weights parameter to a vector where all feature weights are set to 1.

# Your code here
def grad_desc(X, y, max_iterations, alpha, initial_weights=None):
    """Be sure to set default behavior for the initial_weights parameter."""
    # Create a for loop of iterations
        # Generate predictions using the current feature weights
        # Calculate an error vector based on these initial predictions and the correct labels
        # Calculate the gradient 
        # As we saw in the previous lab, calculating the gradient is often the most difficult task.
        # Here, you are provided with the closed-form solution for the gradient of the log-loss function derived from MLE.
        # For more details on the derivation, see the additional resources section below.
        gradient = np.dot(X.transpose(), error_vector)
        # Update the weight vector: take a step of size alpha in the direction of the gradient
    # Return finalized weights
    
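A complete sketch of the function, filling in the scaffold above; it assumes the NumPy import and the sigmoid() function from the earlier cells.

def grad_desc(X, y, max_iterations, alpha, initial_weights=None):
    """Gradient descent for logistic regression (no intercept term)."""
    if initial_weights is None:
        # Default: all feature weights set to 1
        initial_weights = np.ones(X.shape[1])
    weights = initial_weights
    for _ in range(max_iterations):
        # Generate predictions using the current feature weights
        predictions = sigmoid(np.dot(X, weights))
        # Error vector: correct labels minus current predictions
        error_vector = y - predictions
        # Closed-form gradient of the log-loss derived from MLE
        gradient = np.dot(X.transpose(), error_vector)
        # Take a step of size alpha in the direction of the gradient
        weights = weights + alpha * gradient
    return weights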

Running your algorithm

Now that you've coded everything from the ground up, you can further investigate the convergence behavior of the gradient descent algorithm. Remember that gradient descent does not guarantee a global minimum, only a local minimum, and that small deviations in the starting point or step size can lead to different outputs.

First, run the following cell to import the data and create the predictor and target variables:

# Import data
import pandas as pd
df = pd.read_csv('heart.csv')

# Create the predictor and target variables
y = df['target']
X = df.drop(columns=['target'])

print(y.value_counts())
X.head()

Run your algorithm and plot the successive weights of the features through iterations, using the X and y defined above. As the model trains, record the iteration number and the weights of the various features. Then, plot this data on subplots, one for each individual feature. Each graph should have the iteration number on the x-axis and the value of that feature's weight at that iteration on the y-axis. This will visually display how the algorithm adjusts the weights over successive iterations and, hopefully, show convergence to stable weights.

# Your code here
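
One possible approach is sketched below. The iteration count and step size are illustrative choices, not values prescribed by the lab (you may need to tune alpha for this data); it assumes the sigmoid() function from above.

max_iterations = 50000  # illustrative; adjust as needed
alpha = 0.001           # illustrative; adjust as needed
weights = np.ones(X.shape[1])
history = np.zeros((max_iterations, X.shape[1]))

for i in range(max_iterations):
    predictions = sigmoid(np.dot(X, weights))
    error_vector = y - predictions
    weights = weights + alpha * np.dot(X.transpose(), error_vector)
    history[i] = weights  # record the weights at this iteration

# One subplot per feature: iteration number vs. weight value
fig, axes = plt.subplots(X.shape[1], 1, figsize=(8, 3 * X.shape[1]))
for j, ax in enumerate(axes):
    ax.plot(history[:, j])
    ax.set_title(X.columns[j])
    ax.set_xlabel('Iteration')
    ax.set_ylabel('Weight')
plt.tight_layout()
plt.show()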

Scikit-learn

For comparison, import scikit-learn's standard LogisticRegression() function. Initialize it with no intercept and C=1e16 or another very high number. The reasoning is as follows: your implementation did not use an intercept, and you did not perform any regularization such as Lasso or Ridge (scikit-learn applies an l2 penalty by default); the high value of C essentially negates the regularization. Also, set the random_state to 2 and use the 'liblinear' solver.

After initializing a regression object, fit it to X and y.

# Your code here
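
For example, using the settings described above:

from sklearn.linear_model import LogisticRegression

logreg = LogisticRegression(fit_intercept=False, C=1e16,
                            random_state=2, solver='liblinear')
logreg.fit(X, y)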

Compare the models

Compare the coefficient weights of your model to those generated by scikit-learn.

# Your code here
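
One way to line them up side by side; this sketch assumes weights holds the final weights from your gradient descent run and logreg is the fitted model from the previous step.

# Side-by-side comparison of the two sets of coefficients
comparison = pd.DataFrame({'gradient_descent': weights,
                           'sklearn': logreg.coef_[0]},
                          index=X.columns)
print(comparison)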

Level up (Optional)

Update the gradient descent algorithm to also return the cost after each iteration. Then rerun the algorithm and create a graph displaying the cost versus the iteration number.

# Your code here
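
A sketch of one way to extend the function: the cost here is the log-loss (negative log-likelihood), and the small eps is added to guard against taking the log of exactly 0. The function and variable names are illustrative.

def grad_desc_with_cost(X, y, max_iterations, alpha, initial_weights=None):
    if initial_weights is None:
        initial_weights = np.ones(X.shape[1])
    weights = initial_weights
    costs = []
    eps = 1e-10  # avoids log(0)
    for _ in range(max_iterations):
        predictions = sigmoid(np.dot(X, weights))
        # Log-loss (negative log-likelihood) for the current weights
        cost = -np.sum(y * np.log(predictions + eps)
                       + (1 - y) * np.log(1 - predictions + eps))
        costs.append(cost)
        weights = weights + alpha * np.dot(X.transpose(), y - predictions)
    return weights, costs

weights, costs = grad_desc_with_cost(X, y, 10000, 0.001)
plt.plot(costs)
plt.xlabel('Iteration')
plt.ylabel('Log-loss')
plt.show()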

Additional Resources

If you want to see more of the mathematics behind the gradient derivation above, check out section 4.4.1 of The Elements of Statistical Learning, which can be found here: https://web.stanford.edu/~hastie/ElemStatLearn//.

Summary

Congratulations! You just coded logistic regression from the ground up using NumPy! With this, you should have a fairly deep understanding of logistic regression and how the algorithm works. In the upcoming labs, you'll continue to explore this from a few more angles, plotting your data along with the decision boundary for your predictions.
