Daily Dose of Data Science

Daily Dose of Data Science is a publication on Substack that brings together intriguing frameworks, libraries, technologies, and tips that make the life cycle of a Data Science project effortless.

This repository is a collection of all the code snippets presented in my publication. If you want to receive these tips in your mailbox daily, you can subscribe to my Substack newsletter.

Run These Code Snippets on Your Local Machine

To run the snippets listed here on your machine, clone this repository:

git clone https://github.com/ChawlaAvi/Daily-Dose-of-Data-Science
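
Once the clone finishes, a quick way to see what is available is to list the notebooks it contains. The following is a minimal sketch, not part of the repository: `list_notebooks` is a hypothetical helper, the directory name is assumed to be git's default for the URL above, and it assumes the snippets are stored as `.ipynb` files.

```python
from pathlib import Path


def list_notebooks(repo_dir):
    """Return every .ipynb file under repo_dir, sorted by path."""
    root = Path(repo_dir)
    # rglob walks subdirectories too; Jupyter's checkpoint copies are skipped.
    return sorted(
        p for p in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )


if __name__ == "__main__":
    # Assumed directory name: git's default for the clone URL above.
    for notebook in list_notebooks("Daily-Dose-of-Data-Science"):
        print(notebook)
```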

Table of Contents

  1. Pandas
  2. Jupyter Tips
  3. Python
  4. Plotting
  5. NumPy
  6. Memory Optimization
  7. Cool Tools
  8. Run-time Optimization
  9. Sklearn
  10. Debugging
  11. Missing Data
  12. ML-AI News
  13. Machine Learning
  14. Statistics
  15. Testing
  16. Terminal
  17. Documents

Pandas

Title Notebook Substack Article
Stop Using The Describe Method in Pandas. Instead, use Summarytools. 🔗 🔗
Analyze A Pandas DataFrame Without Code 🔗 🔗
70x Faster Pandas By Changing Just One Line of Code 🔗 🔗
Reduce Memory Usage Of A Pandas DataFrame By 90% 🔗 🔗 🔗
Speed-up Pandas Apply 5x with NumPy 🔗 🔗
A Lesser-Known Feature of Apply Method In Pandas 🔗 🔗
Create Pandas DataFrame from Dataclass 🔗 🔗
Run SQL in Jupyter To Analyze A Pandas DataFrame 🔗 🔗
When You Should Not Use the head() Method In Pandas 🔗 🔗
Three Lesser-known Tips For Reading a CSV File Using Pandas 🔗 🔗
The Best File Format To Store A Pandas DataFrame 🔗 🔗 🔗
Lesser-Known Feature of the Merge Method in Pandas 🔗 🔗
The Best Way to Use Apply() in Pandas 🔗 🔗
A No-code Tool To Understand Your Data Quickly 🔗 🔗
Display Progress Bar With Apply() in Pandas 🔗 🔗
Supercharge value_counts() Method in Pandas With Sidetable 🔗 🔗
Explore CSV Data Right From The Terminal 🔗 🔗
Define the Correct DataType for Categorical Columns 🔗 🔗 🔗
Don't Create Conditional Columns in Pandas with Apply 🔗 🔗
Write Your Own Flavor Of Pandas 🔗 🔗
Create DataFrame Hassle-free By Using Clipboard 🔗 🔗
Alter the Datatype of Multiple Columns at Once 🔗 🔗
Why you should not dump DataFrames to a CSV 🔗 🔗 🔗
Why You Should Not Read CSVs with Pandas 🔗 🔗 🔗
Parallelize Pandas Apply() With Swifter 🔗 🔗
A Hidden Feature of Describe Method In Pandas 🔗 🔗
Enrich Your Notebook With Interactive Controls 🔗 🔗
Data Analysis Using No-Code Pandas In Jupyter 🔗 🔗
Create Pivot Tables, Aggregations and Plots Without Any Code 🔗 🔗 🔗
Parallelize Pandas with Pandarallel 🔗 🔗 🔗
Pretty Plotting With Pandas 🔗 🔗
How to Read Multiple CSV Files Efficiently 🔗 🔗 🔗
Configure Sklearn To Output Pandas DataFrame 🔗 🔗
Datatype For Handling Missing Valued Columns in Pandas 🔗 🔗 🔗
Vectorization Does Not Always Guarantee Better Performance 🔗 🔗

Jupyter Tips

Title Notebook Substack Article
Never Search Jupyter Notebooks Manually Again To Find Your Code 🔗 🔗
Stop Previewing Raw DataFrames. Instead, Use DataTables 🔗 🔗
Label Your Data With The Click Of A Button 🔗 🔗
The Coolest Jupyter Notebook Hack 🔗 🔗
View Documentation in Jupyter Notebook 🔗 🔗
Get Notified When Jupyter Cell Has Executed 🔗 🔗
Clear Cell Output In Jupyter Notebook During Run-time 🔗 🔗
CodeSquire: The AI Coding Assistant You Should Use Over GitHub Copilot 🔗 🔗
Find Your Code Hiding In Some Jupyter Notebook With Ease 🔗 🔗
Enrich Your Notebook With Interactive Controls 🔗 🔗
Data Analysis Using No-Code Pandas In Jupyter 🔗 🔗
Create Pivot Tables, Aggregations and Plots Without Any Code 🔗 🔗 🔗
Restart Notebook Without Losing Variables 🔗 🔗 🔗
Retrieve Previously Computed Output In Jupyter Notebook 🔗 🔗 🔗
Transfer Variables Between Jupyter Notebooks 🔗 🔗 🔗

Python

Title Notebook Substack Article
F-strings Are Much More Versatile Than You Think 🔗 🔗
A Single Line That Will Make Your Python Code Faster 🔗 🔗
Make Dot Notation More Powerful in Python 🔗 🔗
An Elegant Way To Perform Shutdown Tasks in Python 🔗 🔗
What Are Class Methods and When To Use Them? 🔗 🔗
Hide Attributes While Printing A Dataclass Object 🔗 🔗
List : Tuple :: Set : ? 🔗 🔗
__post_init__: Add Attributes To A Dataclass Post Initialization 🔗 🔗
Simplify Your Functions With Partial Functions 🔗 🔗
DotMap: A Better Alternative to Python Dictionary 🔗 🔗
Prevent Wild Imports With __all__ in Python 🔗 🔗
Performance Comparison of Python 3.11 and Python 3.10 🔗 🔗
Why 256 is 256 But 257 is not 257? 🔗 🔗
Make a Class Object Behave Like a Function 🔗 🔗
Lesser-known Feature of Pickle Files 🔗 🔗
Specify Loops and Runs In %%timeit 🔗 🔗
Don't Use time.time() To Measure Execution Time 🔗 🔗
Import Your Python Package as a Module 🔗 🔗
Fine-grained Error Tracking With Python 3.11 🔗 🔗
Run Python Project Directory As A Script 🔗 🔗
Use Slotted Class To Improve Your Python Code 🔗 🔗
Using Dictionaries In Place of If-conditions 🔗 🔗
In Defense of Match-case Statements in Python 🔗 🔗
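
To give a flavor of the tips in this section, here is a minimal sketch of the idea behind "Using Dictionaries In Place of If-conditions". It is an illustration written for this index, not the code from the linked notebook; `OPERATIONS` and `calculate` are hypothetical names.

```python
def add(a, b):
    return a + b


def subtract(a, b):
    return a - b


def multiply(a, b):
    return a * b


# Map each operation name to the function that implements it.
OPERATIONS = {
    "add": add,
    "subtract": subtract,
    "multiply": multiply,
}


def calculate(operation, a, b):
    """Look up the handler in a dictionary instead of walking an if/elif chain."""
    try:
        handler = OPERATIONS[operation]
    except KeyError:
        raise ValueError(f"Unknown operation: {operation!r}") from None
    return handler(a, b)


print(calculate("add", 2, 3))       # 5
print(calculate("multiply", 4, 5))  # 20
```

Adding a new operation then means adding one dictionary entry rather than another branch.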

Plotting

Title Notebook Substack Article
Simple One-Liners to Preview a Decision Tree Using Sklearn 🔗 🔗
Create Data Plots Right From The Terminal 🔗 🔗
Make Your Matplotlib Plots More Professional 🔗 🔗
Perfplot: Measure, Visualize and Compare Run-time With Ease 🔗 🔗
Prettify Word Clouds In Python 🔗 🔗
Calendar Map As A Richer Alternative to Line Plot 🔗 🔗
Density Plot As A Richer Alternative to Scatter Plot 🔗 🔗 🔗
Python One-Liner To Create Sketchy Hand-drawn Plots 🔗 🔗
Create a Moving Bubbles Chart in Python 🔗 🔗
Visualizing Google Search Trends of 2022 using Python 🔗 🔗
Create A Racing Bar Chart In Python 🔗 🔗
Elegantly Plot the Decision Boundary of a Classifier 🔗 🔗
Dot Plot: A Potential Alternative to Bar Plot 🔗 🔗 🔗
Hexbin Plots As A Richer Alternative to Scatter Plots 🔗 🔗 🔗
Enrich Your Notebook With Interactive Controls 🔗 🔗
Regression Plot Made Easy with Plotly 🔗 🔗
Pretty Plotting With Pandas 🔗 🔗
Polynomial Linear Regression Plot Made Easy With Seaborn 🔗 🔗
Analyse Flow Data With Sankey Diagrams 🔗 🔗
Waterfall Charts: A Better Alternative to Line/Bar Plot 🔗 🔗 🔗

NumPy

Title Notebook Substack Article
Speed-up NumPy 20x with Numexpr 🔗 🔗
An Elegant Way To Perform Matrix Multiplication 🔗 🔗
Difference Between Dot and Matmul in NumPy 🔗 🔗
Don't Print NumPy Arrays! Use Lovely-NumPy Instead 🔗 🔗
Polynomial Linear Regression with NumPy 🔗 🔗

Memory Optimization

Title Notebook Substack Article
70x Faster Pandas By Changing Just One Line of Code 🔗 🔗
Reduce Memory Usage Of A Pandas DataFrame By 90% 🔗 🔗 🔗
The Best File Format To Store A Pandas DataFrame 🔗 🔗 🔗
Define the Correct DataType for Categorical Columns 🔗 🔗 🔗
Datatype For Handling Missing Valued Columns in Pandas 🔗 🔗 🔗
Save Memory with Python Generators 🔗 🔗
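
As a rough illustration of the last entry, "Save Memory with Python Generators", the sketch below compares a list with an equivalent generator expression. It is written for this index and is not the code from the linked notebook; the variable names and the value of `n` are arbitrary.

```python
import sys

n = 1_000_000

squares_list = [i * i for i in range(n)]  # materializes all n values at once
squares_gen = (i * i for i in range(n))   # produces values lazily, one at a time

# sys.getsizeof reports the container object only (not the integers it refers to),
# but that is already enough to show the difference.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list:      {list_size:,} bytes")
print(f"generator: {gen_size:,} bytes")

# Both can feed the same aggregate; note that the generator is consumed by sum().
total_from_gen = sum(squares_gen)
total_from_list = sum(squares_list)
print(total_from_gen == total_from_list)  # True
```

The trade-off is that a generator can be iterated only once and does not support indexing or `len()`.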

Cool Tools

Title Notebook Substack Article
Preview Your README File Locally In GitHub Style 🔗 🔗
This GUI Tool Can Possibly Save You Hours Of Manual Work 🔗 🔗
Stop Previewing Raw DataFrames. Instead, Use DataTables. 🔗 🔗
Converting Python To LaTeX Has Possibly Never Been So Simple 🔗 🔗
Label Your Data With The Click Of A Button 🔗 🔗
Analyze A Pandas DataFrame Without Code 🔗 🔗
A No-Code Online Tool To Explore and Understand Neural Networks 🔗 🔗
Speed-up NumPy 20x with Numexpr 🔗 🔗
Debugging Made Easy With PySnooper 🔗 🔗
Deep Learning Network Debugging Made Easy 🔗 🔗
CodeSquire: The AI Coding Assistant You Should Use Over GitHub Copilot 🔗 🔗
Find Unused Python Code With Ease 🔗 🔗
Enrich Your Notebook With Interactive Controls 🔗 🔗
Data Analysis Using No-Code Pandas In Jupyter 🔗 🔗
Modify Python Code During Run-Time 🔗 🔗 🔗
Modify Function During Run-Time 🔗 🔗 🔗
Importing Modules Made Easy with Pyforest 🔗 🔗
Create Pivot Tables, Aggregations and Plots Without Any Code 🔗 🔗 🔗

Run-time Optimization

Title Notebook Substack Article
A Single Line That Will Make Your Python Code Faster 🔗 🔗
Make Sklearn KMeans 20x times faster 🔗 🔗
Speed-up NumPy 20x with Numexpr 🔗 🔗
The Best File Format To Store A Pandas DataFrame 🔗 🔗 🔗
The Best Way to Use Apply() in Pandas 🔗 🔗
Don't Create Conditional Columns in Pandas with Apply 🔗 🔗
Why you should not dump DataFrames to a CSV 🔗 🔗 🔗
Parallelize Pandas Apply() With Swifter 🔗 🔗
Parallelize Pandas with Pandarallel 🔗 🔗 🔗
How to Read Multiple CSV Files Efficiently 🔗 🔗 🔗

Sklearn

Title Notebook Substack Article
Sklearn One-liner to Generate Synthetic Data 🔗 🔗
Skorch: Use Scikit-learn API on PyTorch Models 🔗 🔗
Make Sklearn KMeans 20x times faster 🔗 🔗
Build Baseline Models Effortlessly With Sklearn 🔗 🔗
Polynomial Linear Regression with NumPy 🔗 🔗
An Elegant Way to Import Metrics From Sklearn 🔗 🔗
Feature Tracking Made Simple In Sklearn Transformers 🔗 🔗
Configure Sklearn To Output Pandas DataFrame 🔗 🔗

Debugging

Title Notebook Substack Article
Debugging Made Easy With PySnooper 🔗 🔗
Don't use print() to debug your code. 🔗 🔗 🔗
Inspect Program Flow with IceCream 🔗 🔗 🔗
Lesser-known Feature of f-strings in Python 🔗 🔗
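
One f-string feature that is handy for debugging is the `=` specifier, available since Python 3.8. Whether this is the feature the last entry covers is an assumption; the sketch below is an illustration written for this index, and `debug_line` is a hypothetical name.

```python
def debug_line(price, quantity):
    """Build a debug message with the '=' specifier (Python 3.8+)."""
    # f"{expr=}" expands to the expression text, an equals sign, and repr(value).
    return f"{price=} {quantity=} {price * quantity=}"


print(debug_line(2.5, 4))  # price=2.5 quantity=4 price * quantity=10.0
```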

Missing Data

Title Notebook Substack Article
Handle Missing Data With Missingno 🔗 🔗
Datatype For Handling Missing Valued Columns in Pandas 🔗 🔗

ML-AI News

Title Notebook Substack Article
Now You Can Use DALL·E With OpenAI API 🔗 🔗

Machine Learning

Title Notebook Substack Article
Is This The Best Animated Guide To KMeans Ever? 🔗 🔗
An Effective Yet Underrated Technique To Improve Model Performance 🔗 🔗
How to Encode Categorical Features With Many Categories? 🔗 🔗
Why KMeans May Not Be The Apt Clustering Algorithm Always 🔗 🔗
Skorch: Use Scikit-learn API on PyTorch Models 🔗 🔗
A No-Code Online Tool To Explore and Understand Neural Networks 🔗 🔗
Make Sklearn KMeans 20x times faster 🔗 🔗
Deep Learning Network Debugging Made Easy 🔗 🔗
Build Baseline Models Effortlessly With Sklearn 🔗 🔗
Polynomial Linear Regression with NumPy 🔗 🔗

Statistics

Title Notebook Substack Article
Pandas and NumPy Return Different Values for Standard Deviation. Why? 🔗 🔗
Why Correlation (and Other Statistics) Can Be Misleading 🔗 🔗
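
The short answer to the first entry's question is the default divisor: NumPy's `std` uses `ddof=0` (divide by n, the population formula), while pandas' `std` uses `ddof=1` (divide by n - 1, the sample formula). The sketch below reproduces both formulas with the standard-library `statistics` module so that it runs without either library installed; it is an illustration written for this index, not the code from the linked notebook.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: divide by n (NumPy's default, ddof=0).
population_std = statistics.pstdev(data)

# Sample standard deviation: divide by n - 1 (pandas' default, ddof=1).
sample_std = statistics.stdev(data)

print(population_std)        # 2.0
print(round(sample_std, 4))  # 2.1381
```

Passing `ddof=1` to NumPy's `std`, or `ddof=0` to pandas' `std`, makes the two libraries agree.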

Testing

Title Notebook Substack Article
Generate Your Own Fake Data In Seconds 🔗 🔗

Terminal

Title Notebook Substack Article
Create Data Plots Right From The Terminal 🔗 🔗
Visualize Commit History of Git Repo With Beautiful Animations 🔗 🔗
How Would You Identify Fuzzy Duplicates In A Data With Million Records? 🔗 🔗
Automated Code Refactoring With Sourcery 🔗 🔗 🔗
Explore CSV Data Right From The Terminal 🔗 🔗

Documents

Title Document Substack Article
37 Hidden Python Libraries That Are Absolute Gems 🔗 🔗
10 Automated EDA Tools That Will Save You Hours Of (Tedious) Work 🔗 🔗
30 Python Libraries to (Hugely) Boost Your Data Science Productivity 🔗 🔗
