Giter Club home page Giter Club logo

learn-sql-for-data-analysis's Introduction

Learn SQL for Data Analysis

I am building this repository for study purposes. I am on the journey to become a Data Analyst, and I want to share what I have learned along the way.

SQL

The Structured Query Language (SQL) was first developed in 1970 by researchers at IBM. The initial version was created to manage and retrieve data from IBM's original relational databases called "System R". A few years later, the SQL language became publicly available. The American National Standards Institute (ANSI) and the International Standard Organization (ISO) then took the SQL language as the standard for communicating with relational databases. While some DBMS (Database Management Systems) have altered the language, the majority still follows the ANSI-approved version of SQL programs.

About the Content

The content of this repository covers the essential SQL commands. Some of the material is based on the book "Getting Started with SQL" by Thomas Nield, published by O'Reilly. However, the SQL code will always be written using MySQL, one of the many DMBS for relational databases available in the market. In some cases, I will highlight the differences in queries and commands between MySQL, SQL Server, and SQLite. Additionally, we will delve into SQL and some NoSQL databases for data analysis.

Note 1: All of this content is written in the form of "notes" by me. I encourage you to take some time and read the official documentation for the SQL language. I will always provide these references and others at the end of each topic.

Note 2: At the time of viewing this, I may still be in the process of developing the content and updating this repository. As a result, you may encounter some unclickable topics. The updates to these topics do not follow a specific order and are based on my needs and study progress. Follow me to stay updated on new upgrades.

Prerequisites

  1. Install MySQL Community Server. and the MySQL Worbench.
  2. Have access and know how to use terminal/command line.

Some of the Basics Commands

DQL (Data Query Language)

  • SELECT: Retrieve data from database.

DML (Data Manipulation Language)

  • INSERT: Insert data into a table.
  • DELETE: Delete data from a database table.
  • UPDATE: Update an existing data within a table.

DDL (Data Definition Language)

  • ALTER: Change the structure of the database.
  • CREATE: Create databases or objects, like tables and views.
  • DROP: Delete objects from database.
  • RENAME: This is used to rename an object existing in the database.

DCL (Data Control Language)

  • GRANT: Give privileges access to database.
  • REVOKE: Withdraws the user's access privileges given by using the GRANT command.

TCL (Transaction Control Language)

  • COMMIT: Commits a Transaction.
  • ROLLBACK: Rollbacks a transaction in case of any error occurs.
  • SAVEPOINT: Sets a save point within a transaction

Queries examples

Sample use case:

You are working for a logistics company that sends emails to customers to inform them about the status of their packages. The company has a database of all the emails sent, which includes the email ID, recipient email address, subject, body, and sending date.

  1. Create the database and its tables:
    CREATE DATABASE logistics_email;
    
    USE logistics_email;
    
    CREATE TABLE emails (
        email_id INT AUTO_INCREMENT PRIMARY KEY,
        recipient VARCHAR(255) NOT NULL,
        subject VARCHAR(255) NOT NULL,
        body TEXT NOT NULL,
        sent_date DATE NOT NULL
     ); 
  1. Insert values to those tables:
    INSERT INTO emails (recipient, subject, body, sent_date)
    VALUES
          ('[email protected]', 'Package Update', 'Your package has been shipped', '2022-01-01'),
          ('[email protected]', 'Package Delivery', 'Your package has been delivered', '2022-01-02'),
          ('[email protected]', 'Package Delay', 'Your package has been delayed', '2022-01-03'),
          ('[email protected]', 'Package Update', 'Your package is on its way', '2022-01-04'),
          ('[email protected]', 'Package Arrival', 'Your package has arrived', '2022-01-05');
  1. Update the email with the subject "Package Delivery" to "Package Delivery Update":
    UPDATE emails
    SET subject = 'Package Delivery Update'
    WHERE email_id = 2;
  1. Add a new column called "status" with "Pending" as its default value:
    ALTER TABLE emails
    ADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'Pending';
  1. Retrieve the number of emails sent to each recipient email address and the average length of the email bodies:
    SELECT 
        recipient,
        COUNT(email_id) as total_emails,
        AVG(LENGTH(body)) as avg_body_length
    FROM emails
    GROUP BY recipient;
  1. Retrieve the number of emails sent to each recipient email address and the average length of the email bodies only for the emails sent in the last week:
    SELECT 
        recipient,
        COUNT(email_id) as total_emails,
        AVG(LENGTH(body)) as avg_body_length
    FROM emails
    WHERE sent_date >= DATE_SUB(NOW(), INTERVAL 7 DAY)
    GROUP BY recipient;

Table of Contents

  1. Select, Filter, Order and Operationals
  2. Handling with Variables
  3. Joining Tables
  4. Aggregating Data for Analysis
  5. Windows Function, Subqueries and Handling with date
  6. Exploratory Data Analysis
  7. Cleaning and Processing Data
  8. Data Analysis
  9. Programming
  10. Optimizing SQL Queries

Contact me ๐Ÿ”— ๐Ÿ‘‡

EricSoares EricSoares

learn-sql-for-data-analysis's People

Contributors

soareseric avatar

Watchers

 avatar

Forkers

bhavika0

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.