Data Science Portfolio

This repository contains a portfolio of data science projects created for self-learning and as a hobby.
The projects are presented as IPython notebooks and R files. The IPython notebooks were made using Google Colab.

For a better view of the portfolio, please visit the website abshkpskr.github.io
Most of the visualizations are made with Plotly and do not render on GitHub, so please visit the website to see them.

Contents

  • Data Analysis and Visualization

    • COVID-19 pandemic Global Analysis: This project contains a detailed study of the spread of coronavirus disease. The Python visualization libraries Plotly and matplotlib are used to show the spread of the virus in different countries. The spread from country to country is shown using a Plotly choropleth (world map) timeline animation. The analysis shows which countries were able to flatten the curve and successfully contain the virus. Metrics such as mortality rate and spread rate are calculated and plotted.

    • COVID-19 pandemic INDIA Analysis: A separate report on the ongoing spread of the virus in India. It covers the spread of the virus across the states of India and the factors that affect it. The analysis uses visualizations that include date-time graphs and district-level maps. It also gives an overview of the decisions taken by the government and the outcomes of those decisions.

    Tools: Pandas, matplotlib, plotly, scipy

  • Machine Learning

    • Python

      • Predicting Number of Rental Bikes: A model that predicts the number of bikes rented, using data from the previous two years. The data includes features such as season, weather condition, humidity, and windspeed, and these factors are analyzed to show their relation to the outcome. The predictions help the company maintain the inventory required in different situations and maximize profit. This is a supervised machine learning regression problem, because the target, the number of bikes, is treated as a continuous variable.

      • Predicting Employee Absenteeism: This project studies the underlying factors that cause employee absenteeism. Factors such as reason for absence, chronic diseases, money spent on commuting, distance between office and home, and number of children are used to predict an employee's absenteeism behaviour. Although the target in this problem is a continuous variable, binning is used to convert it into a supervised machine learning classification problem. The study helps the organization manage its workforce better and improve its hiring process.

      Tools: Pandas, matplotlib, plotly, scipy

    • R language

      • Predicting Number of Rental Bikes: This is the same project as above, implemented in R. Both versions follow a step-wise analysis that includes data preparation, exploratory data analysis, and outlier analysis. For prediction, three algorithms are compared: Linear Regression, Decision Tree, and Random Forest.

      • Predicting Employee Absenteeism: The employee absenteeism project implemented in R. The data analysis steps include data preparation, visualization using ggplot, missing value analysis, outlier analysis, feature selection, and scaling. The monthly loss for the company is also calculated.

      Tools: caret, ggplot2, Hmisc, PerformanceAnalytics, caTools, randomForest, e1071

  • Timeline report

    • CoronaVirus Timeline: This is a timeline report of the worldwide spread of coronavirus. It covers the series of events that occurred, such as the official news from China, the reactions of different countries, decisions taken by their authorities, public reaction, and lockdown dates. Links, press releases, official documents, and news videos are embedded in the report to tell the story of how the world tackled the coronavirus pandemic.
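
Two of the calculations mentioned in the project descriptions above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the notebooks: the metric definitions (mortality rate as deaths per confirmed case, spread rate as day-over-day growth of cumulative cases), the function names, the bin edges, and the class labels are all assumptions, and the numbers are made up. In the notebooks themselves the same steps would more likely be done with pandas, for example `pandas.cut` for the binning.

```python
from bisect import bisect_left
from typing import List, Sequence


def mortality_rate(confirmed: int, deaths: int) -> float:
    """Deaths as a percentage of confirmed cases (assumed definition)."""
    if confirmed <= 0:
        return 0.0
    return 100.0 * deaths / confirmed


def daily_growth(cumulative: Sequence[int]) -> List[float]:
    """Day-over-day percentage growth of a cumulative case series.

    A shrinking growth rate is one simple sign of a flattening curve.
    """
    growth = []
    for prev, curr in zip(cumulative, cumulative[1:]):
        growth.append(100.0 * (curr - prev) / prev if prev > 0 else 0.0)
    return growth


def bin_target(values: Sequence[float],
               edges: Sequence[float],
               labels: Sequence[str]) -> List[str]:
    """Turn a continuous target into class labels using right-closed bins.

    `edges` must be sorted in ascending order. With edges [2, 8] the bins
    are (-inf, 2], (2, 8] and (8, inf), so one more label than edges is needed.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than bin edges")
    return [labels[bisect_left(edges, v)] for v in values]


if __name__ == "__main__":
    # Made-up figures for illustration only, not real data.
    print(mortality_rate(confirmed=200, deaths=5))
    print(daily_growth([100, 150, 180, 189]))

    # Hypothetical absenteeism hours and hypothetical class boundaries.
    hours = [0, 2, 3, 8, 40]
    print(bin_target(hours, edges=[2, 8], labels=["low", "moderate", "high"]))
```

Once the target is binned this way, the absenteeism problem can be handled by any classifier instead of a regressor. The choice of bin edges changes the class balance, so it is worth checking how many samples fall into each class before training.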

If you liked what you saw and want to chat with me about the portfolio, work opportunities, or collaboration, send an email to [email protected]

