Learn_EDA_for_Data_Science

Univariate, Bivariate and Multi-variate Analysis

Data structure
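
A first look at the structure of a dataset might go as follows. This is a minimal sketch assuming the data is held in a pandas DataFrame; the sample values and the `temperature` column are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the column names and values are illustrative only.
df = pd.DataFrame({
    "temperature": [21.5, 19.0, 25.3],
    "weather": ["sunny", "rainy", "sunny"],
})

n_rows, n_cols = df.shape   # (number of rows, number of columns)
column_types = df.dtypes    # dtype of each column
df.info()                   # prints columns, non-null counts and memory use
print(df.head())            # first rows of the frame
```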

Data Type Conversion

  • coerce replaces non-numeric values in the column with NA (NaN) values
  • without coerce, any value that cannot be converted to numeric raises an error, which is why the statement above uses it
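
The two points above can be illustrated with a short sketch. It assumes the conversion is done with pandas `to_numeric`, which the outline does not name; the `humidity` column and its values are hypothetical.

```python
import pandas as pd

# Hypothetical column holding numbers as text plus one bad entry.
df = pd.DataFrame({"humidity": ["45", "50", "unknown", "62"]})

# Default behaviour (errors="raise"): an unparseable value raises ValueError.
try:
    pd.to_numeric(df["humidity"])
    raised = False
except ValueError:
    raised = True

# errors="coerce": unparseable values become NaN instead of raising.
df["humidity"] = pd.to_numeric(df["humidity"], errors="coerce")
n_missing = int(df["humidity"].isna().sum())
print(df)
```

The NaN values introduced this way then need to be handled in the missing-value step.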

Remove Duplicates

  • Count of Duplicated Rows
  • print the duplicated rows
  • Drop Columns
  • Rename the awkwardly named columns
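
The steps above could be sketched as follows with pandas. The data, the `station` column that gets dropped and the `Temp C` column that gets renamed are all hypothetical.

```python
import pandas as pd

# Hypothetical data: rows 1 and 2 are identical.
df = pd.DataFrame({
    "station": ["S1", "S1", "S1", "S1"],
    "Temp C": [20, 22, 22, 25],
    "weather": ["sunny", "rainy", "rainy", "sunny"],
})

# Count of duplicated rows (the first occurrence is not counted).
n_duplicates = int(df.duplicated().sum())

# Print the duplicated rows.
duplicate_rows = df[df.duplicated()]
print(duplicate_rows)

# Remove the duplicates, drop an unneeded column, rename an awkward one.
clean = (
    df.drop_duplicates()
      .drop(columns=["station"])
      .rename(columns={"Temp C": "temperature_c"})
)
```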

Outlier Detection

  • Box plot
  • Extracting Outliers
  • Fliers are Outliers
  • To get Whiskers
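
One way to read outliers and whiskers off a box plot is through the dictionary that matplotlib's `boxplot` returns. The sketch below assumes matplotlib is the plotting library; the `value` column and its data are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data with one obvious outlier.
df = pd.DataFrame({"value": [1, 2, 3, 4, 5, 100]})

fig, ax = plt.subplots()
box = ax.boxplot(df["value"].to_numpy())  # returns a dict of the drawn artists

# "fliers" are the points drawn beyond the whiskers, i.e. the outliers.
outliers = box["fliers"][0].get_ydata()

# Each whisker is a line from the box edge to the whisker end;
# the second y value of each line is the whisker end.
lower_whisker = box["whiskers"][0].get_ydata()[1]
upper_whisker = box["whiskers"][1].get_ydata()[1]

print(outliers, lower_whisker, upper_whisker)
plt.close(fig)
```

With the default settings the whiskers extend to the most extreme data points within 1.5 times the interquartile range of the box edges.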

Descriptive Stats
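
A minimal sketch of descriptive statistics, assuming pandas `describe` is used; the `temperature` data is hypothetical.

```python
import pandas as pd

# Hypothetical numeric column.
df = pd.DataFrame({"temperature": [20.0, 22.0, 24.0, 26.0]})

# count, mean, std, min, quartiles and max for each numeric column.
summary = df.describe()
print(summary)

mean_temp = summary.loc["mean", "temperature"]
```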

Check for Balanced or Imbalanced Data in Categorical data

  • Bar Plot
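
Class balance can be checked by counting each category and plotting the counts. The sketch below assumes pandas and matplotlib; the `weather` values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical categorical column with an 80/20 split.
df = pd.DataFrame({"weather": ["sunny"] * 8 + ["rainy"] * 2})

counts = df["weather"].value_counts()                # absolute counts per category
shares = df["weather"].value_counts(normalize=True)  # fraction per category
print(shares)

# Bar plot of the counts; very unequal bars indicate imbalance.
fig, ax = plt.subplots()
ax.bar(list(counts.index), counts.to_numpy())
ax.set_xlabel("weather")
ax.set_ylabel("count")
plt.close(fig)
```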

Missing Values and Imputation

  • Mean Imputation
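
Mean imputation replaces each missing numeric value with the mean of the observed values in that column. A minimal pandas sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical numeric column with one missing value.
df = pd.DataFrame({"temperature": [20.0, None, 24.0]})

# mean() skips NaN by default, so this is the mean of the observed values.
column_mean = df["temperature"].mean()
df["temperature"] = df["temperature"].fillna(column_mean)
print(df)
```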

Null Value Imputation for Categorical Data

  • Get the object values
  • Missing value imputation for categorical value
  • Join the data set with imputed object dataset
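
The three steps above could look like the sketch below. Filling with the most frequent value (the mode) is an assumption, since the outline does not say which strategy is used; the data is hypothetical.

```python
import pandas as pd

# Hypothetical data with a missing categorical value.
df = pd.DataFrame({
    "temperature": [20.0, 22.0, 24.0, 26.0],
    "weather": ["sunny", None, "sunny", "rainy"],
})

# Get the object (non-numeric) columns.
objects = df.select_dtypes(exclude="number")

# Fill each column with its most frequent value (assumed strategy).
objects_imputed = objects.apply(lambda col: col.fillna(col.mode()[0]))

# Join the imputed columns back onto the numeric columns.
numeric = df.select_dtypes(include="number")
df_imputed = pd.concat([numeric, objects_imputed], axis=1)
print(df_imputed)
```

Here `exclude="number"` is used to pick the non-numeric columns so that the selection does not depend on how a given pandas version labels string columns.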

Scatter plot and Correlation Analysis
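
A scatter plot shows the relationship between two numeric columns, and a correlation matrix quantifies it. The sketch below assumes pandas and matplotlib; both columns are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data with a perfectly linear negative relationship.
df = pd.DataFrame({
    "temperature": [20.0, 22.0, 24.0, 26.0],
    "humidity": [60.0, 55.0, 50.0, 45.0],
})

# Scatter plot of one column against the other.
fig, ax = plt.subplots()
ax.scatter(df["temperature"], df["humidity"])
ax.set_xlabel("temperature")
ax.set_ylabel("humidity")
plt.close(fig)

# Pairwise Pearson correlation (the pandas default method).
corr_matrix = df.corr()
r = corr_matrix.loc["temperature", "humidity"]
print(corr_matrix)
```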

Transformation of Data

  • Creating Dummy Values for weather column
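
Dummy values turn a categorical column into one indicator column per category. This sketch assumes pandas `get_dummies` is used; the `weather` column name comes from the outline, while its values and the `temperature` column are hypothetical.

```python
import pandas as pd

# Hypothetical data; only the "weather" column name comes from the outline.
df = pd.DataFrame({
    "temperature": [20.0, 22.0, 24.0],
    "weather": ["sunny", "rainy", "sunny"],
})

# Replace "weather" with one indicator column per category.
dummies = pd.get_dummies(df, columns=["weather"])
print(dummies)
```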

Normalization of the Data (range 0 to 1)

  • Summarize Transform data
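
Min-max normalization rescales each column as (x - min) / (max - min), so its values lie between 0 and 1. The sketch below does this directly in pandas, which may differ from the tool used in the original notebook; the data is hypothetical.

```python
import pandas as pd

# Hypothetical numeric columns on different scales.
df = pd.DataFrame({
    "temperature": [20.0, 22.0, 26.0],
    "humidity": [40.0, 50.0, 60.0],
})

# Column-wise min-max scaling; a constant column would divide by zero.
normalized = (df - df.min()) / (df.max() - df.min())

# Summarize the transformed data: each column now has min 0 and max 1.
print(normalized.describe())
```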

Standardize data (0 mean, 1 std; most values fall within -3 sigma to +3 sigma)
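
Standardization subtracts the column mean and divides by the column standard deviation. The sketch below does this directly in pandas with the population standard deviation (`ddof=0`), which is an assumption about the intended formula; the data is hypothetical.

```python
import pandas as pd

# Hypothetical numeric columns on different scales.
df = pd.DataFrame({
    "temperature": [20.0, 22.0, 24.0, 26.0],
    "humidity": [40.0, 50.0, 55.0, 75.0],
})

# z = (x - mean) / std, computed column by column.
# pandas defaults to ddof=1; ddof=0 gives the population standard deviation.
standardized = (df - df.mean()) / df.std(ddof=0)

col_means = standardized.mean()
col_stds = standardized.std(ddof=0)
print(standardized)
```

The result is not bounded: values beyond 3 standard deviations from the mean are possible and are often treated as candidate outliers.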

Speed Up EDA Process
