pgexperts / pgx_scripts Goto Github PK

A collection of useful little scripts for database analysis and administration, created by our team at PostgreSQL Experts.

License: Other

Shell 80.91% Python 14.22% PLpgSQL 4.87%

pgx_scripts's Introduction

pgx_scripts

A collection of useful little scripts for database analysis and administration, created by our team at PostgreSQL Experts.

bloat

Queries to estimate bloat in tables and indexes.

index_bloat_check.sql

An overhauled index bloat check. Lists indexes which are likely to be bloated and estimates bloat amounts. Requires PostgreSQL > 8.4, superuser access, and a 64-bit compile. Only works for BTree indexes, not GIN, GiST, or more exotic indexes. Still needs cleanup.

table_bloat_check.sql

An overhauled table bloat check. Lists tables which are likely to be bloated and estimates bloat amounts. Requires PostgreSQL >= 8.4 and a 64-bit compile. Cannot estimate bloat for tables containing types with no stats functions (such as original JSON).

no_stats_table_check.sql

Query to list all tables which have "no stats" columns and thus can't be estimated.

kill_idle

kill_idle_91.sql

A stored procedure which kills idle transactions on PostgreSQL versions 8.3 to 9.1. Intended to be called by a cron job. Takes idle time, polling time, and exempted user list parameters. Outputs pipe-delimited text with the data about the sessions killed.

kill_idle_93.sql

A stored procedure which kills idle transactions on PostgreSQL versions 9.2 and later. Intended to be called by a cron job. Takes idle time and exempted user list parameters. Outputs JSON with the data about the sessions killed.

Indexes

Various queries to introspect index usage.

fk_no_index.sql

Queries for foreign keys with no index on the referencing side. Note that you don't always want indexes on the referencing side, but this helps you decide if you do.

duplicate_indexes_fuzzy.sql

Check indexes and looks at whether or not they are potentially duplicates. It does this by checking the columns used by each index, so it reports lots of false duplicates for partial and functional indexes.

needed_indexes.sql

Checks for tables which are getting too much sequential scan activity and might need additional indexing. Reports in four groups based on table size, number of scans, write activity, and number of existing indexes.

unneeded_indexes.sql

Checks for indexes with relatively light usage stats, for possible removal.

Locks

Tools and a set of queries to analyze lock-blocking.

transaction_locks.sql

Requires: Postgres 9.2+

Lists waiting transaction locks and what they're waiting on, if possible. Includes relation and query information, but realistically needs to be accompanied by full query logging to be useful. Needs to be run per active database.

table_locks.sql

Lists direct locks on tables which conflict with locks held by other sessions. Note that table locks are often short-lived, and as a result this will often result in zero rows.

Additional Contributors

In addition to the staff of PostgreSQL Experts, we are indebted to:

The authors of the check_postgres.pl script, especially Greg Sabino Mulainne, for supplying the original bloat queries on which our bloat queries are based.
Andrew Gierth for help on various system queries.
ioguix for collaborating on bloat calculation math.

pgx_scripts's People

Contributors

Stargazers

Watchers

Forkers

sahglie abeltsov sean- tsingson lifeisgoodmf denishpatel tonygaetani mladenuzelac swrd bondmi007 danrtichell thomasdarimont johnaohara btbytes c0psrul3 bratu-ionut cmabastar taojonesin syaroslavtsev douglasresende quarkness davidhooey ztyan gburiticato christopher-abastar rlittlefield techgaun segmond aorashi rwarren99 vanife yarwelp elevate nemoaa diogonicoleti zhuomingliang commandprompt dingguitao shahriyarsheikh stevenmg bhavikgit liuqian1990 replyvishu gregoryaparicio nvichare chrislefever anvk fxkraus asheet-bhaskar gridl kimsumin75 proteus-aholt xluffy-fork milosbabic nikolays logbue-dataxu perlchild joe2hpimn lukassfranklin davidgoussev wagner-buettner oddoye-david agnosy rvanvelzen creichert cristian357r4 r8tman cameroun ssoriche bajche vinnix xeor sahapasci rogeriofalcone ravi-code-ranjan amacvar deepdivenow srqway kotsachin mgiay radoslawosinski barazbay vijay-saini sosso francoisjacques mrsurname rekgrpth epasham senzacionale maniacs-ops bakink goryszewskig praveend19 minhdtb samezhao muvvas edib srikanth-medikonda xiaonuogantan benlorence

pgx_scripts's Issues

Misinterpretation of pg_index.indkey column results in overestimate of index bloat

In index_item_sizes portion of the index bloat query, indexes with multiple columns do not retain all attributes from the btree_index_atts subquery. This is due to a misunderstanding of the contents of the pg_index.indkey column.

From the documentation:

This is an array of indnatts values that indicate which table columns this index indexes. For example a value of 1 3 would mean that the first and the third table columns make up the index key. A zero in this array indicates that the corresponding index attribute is an expression over the table columns, rather than a simple column reference.

Thus if columns 2 and 3 are indexed, indkey will reflect '2 3'. However, in pg_attribute, the index attributes will be 1 and 2. Thus joining the two tables will result in only one attribute result (2==2), and it will be incorrect (table col2 != index col2).

There are generally two ways to fix this:

Alter the JOIN to pg_attribute to use indexrelid instead, and subsequently modify the JOIN to pg_stats similarly.
Replace the regexp_split_to_table call with generate_series bounded by pg_index.indnatts. If an index has 3 columns, the attributes in pg_attribute will be listed as 1, 2, and 3. This means the indkey split isn't strictly necessary.

Returns only 1 index on the table with the data

The query returns bloat and size information only on one index of the table even if tables has multiple indexes.

Incorrect where clause on btree_index_atts and pg_attribute join

This condition pg_attribute.attnum = ind_atts.attnum is filtering out majority of the bloated indexes.
Ref - JOIN btree_index_atts AS ind_atts ON pg_attribute.attrelid = ind_atts.indexrelid AND pg_attribute.attnum = ind_atts.attnum

This is because attnum of indexes in pg_attribute table will not always match with indkey of index in pg_index table. It will usually match for primary key or initial columns. Thus, pg_attribute.attnum = ind_atts.attnum condition has to be removed.

This condition : pg_stats.attname = pg_catalog.pg_get_indexdef(pg_attribute.attrelid, pg_attribute.attnum, TRUE) while joining with pg_stats will take care of the rest.

wrong join for index bloat

Hi, There is (IMO) a2 wrong joins in second block.
should be
JOIN btree_index_atts AS ind_atts ON pg_attribute.attrelid = ind_atts.indrelid AND pg_attribute.attnum = ind_atts.attnum
instead of
JOIN btree_index_atts AS ind_atts ON pg_attribute.attrelid = ind_atts.indexrelid AND pg_attribute.attnum = ind_atts.attnum

the following join, Is aso wrong. I use:

LEFT JOIN pg_stats ON pg_stats.schemaname = ind_atts.nspname
-- stats for regular index columns
AND ( (pg_stats.tablename = ind_atts.tablename AND pg_catalog.pg_get_indexdef(ind_atts.indexrelid, pg_attribute.attnum, TRUE) ~ pg_stats.attname)
-- stats for functional indexes
OR (pg_stats.tablename = ind_atts.index_name AND pg_stats.attname = pg_attribute.attname))

the join with pg_attributes include attnum but pg_attributes will number the attributes for the index at 1. the pg_index indkey is the physical index to the column within the table.

thanks for this query. its very helpful.