- docker >= 17.12.0+
- docker-compose
- Clone or download this repository
- Go inside of directory,
cd compose-postgres
- Run this command
docker-compose up -d
This Compose file contains the following environment variables:
POSTGRES_USER
the default value is postgresPOSTGRES_PASSWORD
the default value is changemePGADMIN_PORT
the default value is 5050PGADMIN_DEFAULT_EMAIL
the default value is [email protected]PGADMIN_DEFAULT_PASSWORD
the default value is admin
localhost:5432
- Username: postgres (as a default)
- Password: changeme (as a default)
- URL:
http://localhost:5050
- Username: [email protected] (as a default)
- Password: admin (as a default)
- Host name/address
postgres
- Port
5432
- Username as
POSTGRES_USER
, by default:postgres
- Password as
POSTGRES_PASSWORD
, by defaultchangeme
Connect to Postgres in Docker Container To enter a Postgres container, you need to execute using the container name and enable psql, the command-line interface for Postgres.
Connecting to
docker exec -it postgres_container psql -U postgres
Commands that are used for defining database schema such as CREATE
, DROP
, ALTER
, etc.
Commands that are used for data presentation such as SELECT
, INSERT
, UPDATE
, DELETE
, etc.
Commands that are used for permission controls such as GRANT
, INVOKE
, etc.
Commands that are used for controlling transactions such as COMMIT
, ROLLBACK
, SAVEPOINT
, etc.
SELECT { column_name | expression } AS alias_name
FROM table;
for example:
SELECT
first_name || ' ' || last_name AS full_name
FROM
customer
ORDER BY
full_name;
SELECT
column_list
FROM
table_name AS alias_name;
The practical uses of the table alias are when you query data from multiple tables that have the same column names. In this case, you must qualify the columns using the table names as follows:
SELECT t1.column_name,
t2.column_name
FROM table_name1 t1
INNER JOIN table_name2 t2 ON join_predicate;
Second, when you join a table to itself a.k.a self-join, you must use table aliases. Because PostgreSQL does not allow you to reference the same table multiple times within a query.
SELECT
colum_list
FROM
table_name table_alias
INNER JOIN table_name ON join_predicate;
QUERY EVALUATION ORDER
- First, the product of all tables in the
FROM
clause is formed. - The
WHERE
clause is then evaluated to eliminate rows that do not satisfy thesearch_condition
. - Next, the rows are grouped using the columns in the
GROUP BY
clause. - Then, groups that do not satisfy the
search_condition
in theHAVING
clause are eliminated. - Next, the expressions in the
SELECT
statement target list are evaluated. - If the
DISTINCT
keyword in present in the select clause, duplicate rows are now eliminated. - The
UNION
is taken after eachsub-select
is evaluated. - Finally, the resulting rows are sorted according to the columns specified in the
ORDER BY
clause. TOP
clause is executed.
SELECT
select_list
FROM
table_name;
- Select distinct rows using
DISTINCT
operator. - Sort rows using
ORDER BY
clause. - Filter rows using
WHERE
clause. - Select a subset of rows from a table using
LIMIT
or -FETCH
clause. - Group rows into groups using
GROUP BY
clause. - Filter groups using
HAVING
clause. - Join with other tables using joins such as
INNER JOIN
, -LEFT JOIN
,FULL OUTER JOIN
,CROSS JOIN clauses
. - Perform set operations using UNION, INTERSECT, and EXCEPT.
SELECT
column_1,
column_2
FROM
table_name
ORDER BY
column_1 [ASC | DESC],
column_2 [ASC | DESC];
The DISTINCT
clause keeps one row for each group of duplicates.
SELECT
DISTINCT column_1
FROM
table_name;
In this statement, the values in the column_1 column are used to evaluate the duplicate.
SELECT
DISTINCT column_1, column_2
FROM
table_name;
In this case, the unique combination of values in both column_1 and column_2 columns will be used for evaluating the duplicate.
The DISTINCT ON
(expression) to keep the “first” row of each group of duplicates using
SELECT
DISTINCT ON (column_1) column_alias,
column_2
FROM
table_name
ORDER BY
column_1,
column_2;
The order of rows returned from the SELECT statement is unpredictable therefore the “first” row of each group of the duplicate is also unpredictable. It is good practice to always use the ORDER BY clause with the DISTINCT ON(expression) to make the result set obvious.
SELECT select_list
FROM table_name
WHERE condition;
The following table illustrates the standard comparison operators.
Operator | Description | Example |
---|---|---|
= | Equal | WHERE first_name = 'Jamie'; |
> | Greater than | |
< | Less than | |
>= | Greater than or equal | |
<= | Less than or equal | |
<> or != | Not equal | WHERE last_name <> 'Motley'; |
AND | Logical operator AND | WHERE first_name = 'Jamie' AND last_name = 'Rice'; |
OR | Logical operator OR | WHERE first_name = 'Jamie' OR last_name = 'Rice'; |
IN | In a list of data | WHERE first_name IN ('Ann','Anne','Annie'); |
LIKE/ILIKE | Matches a pattern | WHERE first_name LIKE 'Ann%'; |
BETWEEN | Is in a range of values | WHERE LENGTH(first_name) BETWEEN 3 AND 5; |
Use the LIMIT clause to get the number of highest or lowest items in a table
SELECT
*
FROM
table_name
LIMIT n;
In case you want to skip a number of rows before returning the n rows, you use OFFSET clause placed after the LIMIT clause as the following statement:
If you use a large OFFSET, it might not be efficient because PostgreSQL still has to calculate the rows skipped by the OFFSET inside the database server, even though the skipped rows are not returned.
SELECT
*
FROM
table
LIMIT n OFFSET m;
OFFSET start_row_count { ROW | ROWS }
FETCH { FIRST | NEXT } [ row_count ] { ROW | ROWS } ONLY
value { NOT IN | IN } (value1,value2,...)
or
value { NOT IN | IN } (SELECT value FROM tbl_name);
You can rewrite IN operator by using the equal (=) and OR operators; You can also rewrite the NOT IN operator by using the not equal (<>) and the AND operators
value { BETWEEN | NOT BETWEEN } low AND high;
for numbers, dates, etc.
string LIKE pattern
ILIKE
means case insensitive LIKE
You can combine the percent ( %) with underscore ( _) to construct
value { IS NULL | IS NOT NULL }
PostgreSQL join is used to combine columns from one (self-join) or more tables based on the values of the common columns between the tables. The common columns are typically the primary key columns of the first table and foreign key columns of the second table.
PostgreSQL supports inner join
, left join
, right join
, full outer join
, cross join
, natural join
, and a special kind of join called self-join
.
SELECT
A.pka,
A.c1,
B.pkb,
B.c2
FROM
A
INNER JOIN B ON A .pka = B.fka;
- First, you specify the column in both tables from which you want to select data in the SELECT clause.
- Second, you specify the main table i.e., A in the FROM clause.
- Third, you specify the table that the main table joins to i.e., B in the INNER JOIN clause. In addition, you put a join condition after the ON keyword i.e, A.pka = B.fka.
The primary key column ( pka) and foreign key column ( fka) are typically indexed; therefore, PostgreSQL only has to check for the match in the indexes, which is very fast.
Each row in the A table may have zero or many corresponding rows in the B table. Each row in the B table has one and only one corresponding row in the A table.
If you want to select rows from the A table that have corresponding rows in the B table, you use the INNER JOIN clause.
If you want to select rows from the A table which may or may not have corresponding rows in the B table, you use the LEFT JOIN clause. In case, there is no matching row in the B table, the values of the columns in the B table are substituted by the NULL values.
SELECT
A.pka,
A.c1,
B.pkb,
B.c2
FROM
A
LEFT JOIN B ON A .pka = B.fka;
- Specify the columns from which you want to select data in the SELECT clause.
- specify the left table i.e., A table where you want to get all rows, in the FROM clause.
- Specify the right table i.e., B table in the LEFT JOIN clause. In addition, specify the condition for joining two tables.
The LEFT JOIN clause returns all rows in the left table ( A) that are combined with rows in the right table ( B) even though there is no corresponding rows in the right table ( B).
The right join
or right outer join
is a reversed version of the left join. It produces a result set that contains all rows from the right
table with matching rows from the left
table. If there is no match, the left side will contain null values.
SELECT
a.id id_a,
a.fruit fruit_a,
b.id id_b,
b.fruit fruit_b
FROM
basket_a a
RIGHT JOIN basket_b b ON a.fruit = b.fruit;
A self-join is a query in which a table is joined to itself. Self-joins are useful for comparing values in a column of rows within the same table.
SELECT column_list
FROM A a1
INNER JOIN A b1 ON join_predicate;
CREATE TABLE employee (
employee_id INT PRIMARY KEY,
first_name VARCHAR (255) NOT NULL,
last_name VARCHAR (255) NOT NULL,
manager_id INT,
FOREIGN KEY (manager_id)
REFERENCES employee (employee_id)
ON DELETE CASCADE
);
INSERT INTO employee (
employee_id,
first_name,
last_name,
manager_id
)
VALUES
(1, 'Windy', 'Hays', NULL),
(2, 'Ava', 'Christensen', 1),
(3, 'Hassan', 'Conner', 1),
(4, 'Anna', 'Reeves', 2),
(5, 'Sau', 'Norman', 2),
(6, 'Kelsie', 'Hays', 3),
(7, 'Tory', 'Goff', 3),
(8, 'Salley', 'Lester', 3);
SELECT
e.first_name || ' ' || e.last_name employee,
m .first_name || ' ' || m .last_name manager
FROM
employee e
INNER JOIN employee m ON m .employee_id = e.manager_id
ORDER BY
manager;
To include the top manager in the result set, you use the
LEFT JOIN
instead ofINNER JOIN
clause . The top manager did not appear on the output.
For example following query will return films with same length.
SELECT
f1.title,
f2.title,
f1.length
FROM
film f1
INNER JOIN film f2 ON f1.film_id <> f2.film_id
AND f1.length = f2.length;
The full outer join
combines the results of both left join
and right join
.
SELECT * FROM A
FULL [OUTER] JOIN B on A.id = B.id;
Perform the CROSS JOIN
of two tables T1
and T2
. For every row from T1
and T2
the result set will contain a row that consists of all columns in the T1
table followed by all columns in the T2
table.
SELECT *
FROM T1
CROSS JOIN T2;
It's equal to
SELECT *
FROM T1
INNER JOIN T2 ON TRUE;
or
SELECT *
FROM T1, T2;
A natural join
is a join that creates an implicit join based on the same column names in the joined tables.
To use NATURAL JOIN
table_name
should be included in id
like table_name_id
.
NATURAL JOIN
removes duplicate fields from result.
SELECT *
FROM T1
NATURAL [INNER (default), LEFT, RIGHT] JOIN T2;
If you use the asterisk (*) in the select list, the result will contain the following columns:
- All the common columns, which are the columns in the both tables that have the same name
- Every column in the first and second tables that is not a common column
NATURAL [INNER, LEFT, RIGHT] JOIN table_name;
is equal to[INNER, LEFT, RIGHT] JOIN table_name USING (table_name_id);
However, you should avoid using the NATURAL JOIN whenever possible because sometimes it may cause an unexpected result.
The GROUP BY
clause divides the rows returned from the SELECT
statement into groups. For each group, you can apply an aggregate function
e.g., SUM()
to calculate the sum of items or COUNT()
to get the number of items in the groups.
aggregate function
created based onGROUP BY
.
In this case, the GROUP BY works like the DISTINCT
clause that removes duplicate rows from the result set.
The GROUP BY
clause is useful when it is used in conjunction with an aggregate function
. For example, to get the amount that a customer
has been paid, you use the GROUP BY
clause to divide the payment
table into groups; for each group, you calculate the total
amounts using the SUM()
function:
SELECT
customer_id,
other_column
SUM (amount)
FROM
payment
GROUP BY
customer_id,
other_column;
some useful aggregate functions are
SUM()
,COUNT()
We often use the HAVING
clause in conjunction with the GROUP BY clause to filter group rows that do not satisfy a specified condition.
SELECT
column_1,
aggregate_function (column_2)
FROM
tbl_name
GROUP BY
column_1
HAVING
condition;
The HAVING
clause sets the condition for group rows created by the GROUP BY
clause after the GROUP BY
clause applies while the WHERE
clause sets the condition for individual rows before GROUP BY
clause applies. This is the main difference between the HAVING
and WHERE
clauses.
The
HAVING
clause is evaluated before theSELECT
- so the server doesn't yet know about that alias.
The following are rules applied to the queries:
- Both queries must return the same number of columns.
- The corresponding columns in the queries must have compatible data types.
The UNION
operator combines result sets of two or more SELECT statements into a single result set.
SELECT
column_1,
column_2
FROM
tbl_name_1
UNION
SELECT
column_1,
column_2
FROM
tbl_name_2;
The UNION operator removes all duplicate rows unless the UNION ALL is used.
The INTERSECT
operator returns any rows that are available in both result set or returned by both queries.
SELECT
column_list
FROM
A
INTERSECT
SELECT
column_list
FROM
B;
The EXCEPT
operator returns distinct rows from the first (left) query that are not in the output of the second (right) query.
SELECT column_list
FROM A
WHERE condition_a
EXCEPT
SELECT column_list
FROM B
WHERE condition_b;
ELECT
c1,
c2,
aggregate_function(c3)
FROM
table_name
GROUP BY
GROUPING SETS (
(c1, c2),
(c1),
(c2),
()
);
** EXAMPLE **
QUERY-1
Following query defines a grouping set of the brand and segment. It returns the number of products sold by brand and segment.
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
brand,
segment;
QUERY-2
The following query finds the number of product sold by brand. It defines a grouping set of the brand:
SELECT
brand,
NULL,
SUM (quantity)
FROM
sales
GROUP BY
brand;
QUERY-3
The following query finds the number of products sold by segment. It defines a grouping set of the segment:
SELECT
NULL,
segment,
SUM (quantity)
FROM
sales
GROUP BY
segment;
QUERY-4
The following query finds the number of products sold for all brands and segments. It defines an empty grouping set.
SELECT
NULL,
NULL,
SUM (quantity)
FROM
sales;
Because the
UNION ALL
requires all result sets to have the same number of columns with compatible data types, you need to adjust the queries by addingNULL
to the selection list of each
QUERY-1
UNION ALL
QUERY-2
UNION ALL
QUERY-3
UNION ALL
QUERY-4
Even though the code works as you expected, it has two main problems. First, it is quite lengthy. Second, it has a performance issue because PostgreSQL has to scan the sales table separately for each query.
To apply this syntax to the example above, you can use GROUPING SETS instead of UNION ALL as shown in the following query:
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
GROUPING SETS (
(brand, segment),
(brand),
(segment),
()
);
The GROUPING
function accepts a name of a column and returns bit 0 if the column is the member of the current grouping set and 1 otherwise. See the following example:
SELECT
GROUPING(brand) grouping_brand,
GROUPING(segment) grouping_segment,
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
GROUPING SETS (
(brand, segment),
(brand),
(segment),
()
)
ORDER BY
brand,
segment;
*** shorter way for GROUPING SETS is CUBE ***
CUBE(c1,c2,c3)
GROUPING SETS (
(c1,c2,c3),
(c1,c2),
(c1,c3),
(c2,c3),
(c1),
(c2),
(c3),
()
)
*** is a special ***
ROLLUP(c1 (main column),c2,c3)
GROUPING SETS (
(c1, c2, c3)
(c1, c2)
(c1)
()
)
For example
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
ROLLUP (brand (main column), segment)
ORDER BY
brand,
segment;
A subquery
is a query
nested inside
another query such as SELECT
, INSERT
, DELETE
and UPDATE
.
- First, executes the subquery.
- Second, gets the result and passes it to the outer query.
- Third, executes the outer query.
If the subquery returns any row, the EXISTS operator returns true. If the subquery returns no row, the result of EXISTS operator is false. The EXISTS operator only cares about the number of rows returned from the subquery, not the content of the rows, therefore, the common coding convention of EXISTS operator is as follows:
EXISTS (SELECT 1 FROM tbl WHERE condition);
expression operator ANY(subquery)
- The
subquery
must return exactlyone
column. - The
ANY
operator must be preceded by one of the following comparison operator =, <=, >, <, > and <>
The
ANY
operator returnstrue
if any value of the subquery meets the condition, otherwise, it returnsfalse
Note that
SOME
is a synonym forANY
, meaning that you can substituteSOME
forANY
in any SQL statement.
*** IMPORTANT EXAMPLE ***
The following example returns the length of movie with maximum length
for each category:
SELECT
MAX( length )
FROM
film
INNER JOIN film_category
USING(film_id)
GROUP BY
category_id;
Subquery in the following statement that finds the films whose lengths are greater than or equal to the maximum length of any film category (In this case it's mean minimum length in subquery result) :
SELECT title
FROM film
WHERE length >= ANY(
SELECT MAX( length )
FROM film
INNER JOIN film_category USING(film_id)
GROUP BY category_id );
The
= ANY
is equivalent toIN
operator.
However
x <> ANY (a,b,c)
is equivalent tox <> a OR <> b OR x <> c
comparison_operator ALL (subquery)
- The ALL operator must be preceded by a comparison operator such as equal (=), not equal (!=), greater than (>), greater than or equal to (>=), less than (<), and less than or equal to (<=).
- The ALL operator must be followed by a subquery which also must be surrounded by the parentheses.
column_name > ALL (subquery)
the expression evaluates to true if a value isgreater
than thebiggest
value returned by the subquery.column_name >= ALL (subquery)
the expression evaluates to true if a value isgreater than or equal
to thebiggest
value returned by the subquery.column_name < ALL (subquery)
the expression evaluates to true if a value isless
than thesmallest
value returned by the subquery.column_name <= ALL (subquery)
the expression evaluates to true if a value isless than or equal
to thesmallest
value returned by the subquery.column_name = ALL (subquery)
the expression evaluates to true if a value isequal to any
value returned by the subquery.column_name != ALL (subquery)
the expression evaluates to true if a value isnot equal to any
value returned by the subquery.
Common Table Expressions are temporary in the sense that they only exist during the execution of the query.
WITH cte_name (column_list) AS (
CTE_query_definition
)
statement;
Common Table Expressions or CTEs are typically used to simplify complex joins and subqueries in PostgreSQL.
For example:
WITH cte_film AS (
SELECT
film_id,
title,
(CASE
WHEN length < 30 THEN 'Short'
WHEN length < 90 THEN 'Medium'
ELSE 'Long'
END) AS length
FROM
film
)
SELECT
film_id,
title,
length
FROM
cte_film
WHERE
length = 'Long'
ORDER BY
title;
OR
WITH cte_rental AS (
SELECT staff_id,
COUNT(rental_id) rental_count
FROM rental
GROUP BY staff_id
)
SELECT s.staff_id,
first_name,
last_name,
rental_count
FROM staff s
INNER JOIN cte_rental USING (staff_id);
-
Improve the readability of complex queries. You use CTEs to organize complex queries in a more organized and readable manner.
-
Ability to create recursive queries. Recursive queries are queries that reference themselves. The recursive queries come in handy when you want to query hierarchical data such as organization chart or bill of materials.
-
Use in conjunction with window functions. You can use CTEs in conjunction with window functions to create an initial result set and use another select statement to further process this result set
Structure:
WITH RECURSIVE cte_name AS(
Non-recursive_CTE_query_definition -- Non-recursive term: the non-recursive term is a CTE query definition that forms the base result set of the CTE structure.
UNION [ALL]
Recursive_CTE_query_definition -- Recursive term: the recursive term is one or more CTE query definitions joined with the non-recursive term using the UNION or UNION ALL operator. The recursive term references the CTE name itself.
) SELECT * FROM cte_name;
- Execute the non-recursive term to create the base result set (R0).
- Execute recursive term with Ri as an input to return the result set Ri+1 as the output.
- Repeat step 2 until an empty set is returned. (termination check)
- Return the final result set that is a UNION or UNION ALL of the result set R0, R1, … Rn Example:
CREATE TABLE employees (
employee_id serial PRIMARY KEY,
full_name VARCHAR NOT NULL,
manager_id INT
);
WITH RECURSIVE subordinates AS (
SELECT
employee_id,
manager_id,
full_name
FROM
employees
WHERE
employee_id = 2
UNION
SELECT
e.employee_id,
e.manager_id,
e.full_name
FROM
employees e
INNER JOIN subordinates s ON s.employee_id = e.manager_id
) SELECT
*
FROM
subordinates;
In this section, you will learn how to insert data into a table with the INSERT statement, modify existing data with the UPDATE statement, and remove data with the DELETE statement. Besides, you learn how to use the upsert statement to merge data.
INSERT INTO table (column1, column2, …)
VALUES
(value1, value2, …),
(value1, value2, …) ,...;
[RETURNING id;]
UPDATE table
SET column1 = value1,
column2 = value2 ,...
WHERE
condition;
UPDATE A
SET A.c1 = expression
FROM B
WHERE A.c2 = B.c2;
DELETE FROM table
WHERE condition;
or
DELETE FROM table
USING another_table
WHERE table.id = another_table.id AND …
INSERT INTO table_name(column_list) VALUES(value_list)
ON CONFLICT { ON CONSTRAINT constraint_name | target_name} `action`;
The action
can be:
DO NOTHING
– means do nothing if the row already exists in the table.DO UPDATE SET column_1 = value_1, .. WHERE condition
– update some fields in the table.