wids-uid66's Introduction

Precipitation prediction using Machine Learning

This project teaches the basics of machine learning and shows how to build a model that predicts precipitation. It is aimed at students who are beginning their careers and looking for a starting point. We start from the basics and move towards advanced concepts.

Problem Statement

Create a model that predicts whether or not precipitation occurs, using various machine learning algorithms.

Project Phases

DOUBT SHEET

This week will be lighter for those already familiar with Python. For those who are new, we have got you covered.

Assignment: Test your Python, Numpy, Pandas and Matplotlib concepts

This week, we will review the key concepts covered in week 2, write code for the machine learning algorithms introduced then, and begin our final project.

Let's begin with our final project. We will proceed step by step, and resources for implementation will be provided.

  • Dataset: Download the dataset to begin your final project. To download individual files from GitHub, use the "GitZip for GitHub" Chrome extension.
  • The precipitation (PRCP) column in the data frame will be the target of this model. Replace all values greater than 0 with 1 (precipitation will occur) and keep values equal to 0 as 0 (precipitation will not occur).
  • Dropping null values: Drop any column that has an excessive number of null values. For the remaining columns with fewer null values, replace those null values with the mode of that column.
  • EDA: Perform EDA to visualize the data and identify outliers.
  • Data preprocessing: Remove outliers and compute the correlation matrix.
  • Use SMOTE to handle class imbalance: Most ML algorithms used for classification were designed on the assumption of an equal number of examples in each class, so the imbalance has to be removed or reduced.
  • Check for null values once again and proceed
  • Feature selection: Select features using the chi-square test; refer to SelectKBest and chi2.
  • Normalise the dataset
  • Training model using different techniques
    • Split data into test and train datasets.
    • Train a logistic regression classifier, a decision tree classifier, and a neural network on the training dataset.
    • Calculate accuracy, precision, recall, F1 score, and ROC_AUC on the test dataset and visualize them.
    • Plot confusion matrix using sklearn.
    • Kindly refer to the documentation provided on Google to perform the above steps
  • Model comparison: Compare the models based on accuracy and ROC_AUC score, and visualize the comparison using seaborn.
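
The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's reference solution: the data is synthetic, every column name except PRCP is hypothetical, the 50% null threshold and k=2 are arbitrary choices, and the split is done before scaling and feature selection so that nothing is fitted on test data. The features are scaled to [0, 1] before the chi-square test because chi2 requires non-negative values.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the weather data; all columns except PRCP are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "TMAX": rng.normal(25, 5, n),
    "TMIN": rng.normal(15, 5, n),
    "HUMIDITY": rng.uniform(20, 100, n),
    "MOSTLY_NULL": np.nan,  # a column that is entirely missing
})
# Make rain more likely at high humidity so the models have a signal to learn.
wet = df["HUMIDITY"] + rng.normal(0, 10, n) > 70
df["PRCP"] = np.where(wet, rng.uniform(0.1, 20, n), 0.0)
df.loc[rng.choice(n, 20, replace=False), "TMAX"] = np.nan  # scattered nulls

# Target: 1 if any precipitation was recorded, otherwise 0.
df["PRCP"] = (df["PRCP"] > 0).astype(int)

# Drop columns that are mostly null (threshold is an assumption),
# then fill the remaining nulls with each column's mode.
null_share = df.isna().mean()
df = df.drop(columns=null_share[null_share > 0.5].index)
for col in df.columns[df.isna().any()]:
    df[col] = df[col].fillna(df[col].mode()[0])

# Split first so scaling and feature selection are fitted on training data only.
X = df.drop(columns="PRCP")
y = df["PRCP"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Normalise to [0, 1], then keep the k best features by chi-square score.
scaler = MinMaxScaler().fit(X_train)
selector = SelectKBest(chi2, k=2).fit(scaler.transform(X_train), y_train)
X_train_sel = selector.transform(scaler.transform(X_train))
X_test_sel = selector.transform(scaler.transform(X_test))

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    pred = model.predict(X_test_sel)
    proba = model.predict_proba(X_test_sel)[:, 1]  # probability of class 1
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
    }

# One row per model, one column per metric, ready for plotting.
comparison = pd.DataFrame(scores).T
print(comparison.round(3))
```

SMOTE is left out of the sketch because it lives in the separate imbalanced-learn package. If you use it, one reasonable choice is to resample only the training split (for example with `SMOTE().fit_resample(X_train_sel, y_train)`), so that the test set keeps its original class balance.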
