Postgres Performance Analysis Scripts
Or How to Stop Worrying About Database Timeouts And Learn to Love These Scripts
Query Performance Analysis Scripts
The `pg_stat_activity` and `pg_stat_statements` Tables are Your Friends
The pg_stat_statements
and pg_stat_activity
tables provide valuable insights on query performance. The pg_stat_statments
table holds information on queries that ran in the past, while the pg_stat_activity
table holds information on queries running right this moment.
The pg_stat_statements
table has has one row per for each query that ran. It provides useful information in columns like:
pg_stat_statements.query
— the query textpg_stat_statements.calls
— The number of times this query was runpg_stat_statements.total_time
— The total time (ms) spent on the query
The pg_stat_activity
table has one row per server process at this moment. It provides insightful info in columns like:
pg_stat_activity.query_start
when query associated with this process startedpg_stat_activity.query
the text of this querypg_stat_activity.state
the state of this query, possible values includeactive
,idle
, etc.
By querying these tables, we can answer questions like:
What are the slowest queries?
What are the slowest queries running right now sorted by running time
Which SELECT queries have read/touched the greatest number of rows
While pg_stat_statement
provides information about queries that already ran, to answer questions about queries running right this moment, we query the pg_stat_activity
table. It can answer questions like:
How many queries have been running for longer than a minute?
How long have the oldest queries been running?
How many queries are currently writing to the same table?
Kill Long Running Queries
Because the pg_stat_activity
table shows queries that are running right now, we can use its pg_cancel_backend
function to kill problematic queries.
For example, if pg_stat_activity
reveals long-running queries clogging up your database, a hot fix can just be to kill those queries. This will give a chance for the shorter queries to use the database. It’s better to have a few internal server errors caused by killing these queries, than to have the entire database unavailable.
Lock Analysis Scripts
Why Are There Locks?
Postgres queries acquire various types of locks on rows and tables. Depending on the type of lock, other queries may or not may be able to run concurrently depending on whether or not the locks conflict. The purpose of these locks is to resolve ambiguity regarding how the concurrent queries should behave. For example, if an INSERT
query is running concurrently with a SELECT
query, should the SELECT
query return the newly inserted row? Or what happens if two UPDATE
queries are running on the same row? These lock mechanism are part of a database design pattern called MVCC (Multi-version Concurrency Control) which is worth understanding.
But for the purpose of this article, just know that locks on rows/tables are mechanisms which prevent certain other queries from accessing that same row/tables until that lock is released.
Conflicting Locks Means Blocking Queries
Certain queries will acquire certain locks. If a query acquires a lock that conflicts with another query’s lock, then the second query cannot run until the first query has completed, or we say the first query is blocking the second query.
For example, a SELECT
query will acquire an ACCESS SHARE
lock on a table. An UPDATE
query will acquire a ROW EXCLUSIVE
lock on table. Because ACCESS SHARE
and ROW EXCLUSIVE
locks do not conflict, a SELECT
and UPDATE
query may both run on the same table concurrently (which makes sense).
However, a CREATE INDEX
without CONCURRENTLY
command acquires a SHARE
lock on a table, which conflicts with a ROW EXCLUSIVE
, which means we cannot UPDATE
query cannot run until the CREATE INDEX
has completed. However, an ACCESS SHARE
lock does not conflict with a SHARE
lock which means a SELECT
query on a table while a CREATE INDEX
is also running on that table.
The pg_locks Table
The pg_locks
table provides information on the locks through the following columns:
locktype
— the type of object being lockedrelation
— the table being locked,null
iflocktype
is notrelation
transactionid
— the id of the transaction that is doing the lockingpid
— the process id holding or waiting for this lockgranted
— a boolean field wheretrue
means the lock is being held andfalse
means process is waiting to acquire the lock
From this table we can find out what queries are blocking other queries.
Index Usage Scripts
Because indices improve read query performance at the expense of write query performance, it’s a good idea to look for un-used (or even low use) indices and drop them.
Here’s a script to find unused or low-use indices:
Cache Performance Scripts
Is there Enough Shared Buffer Cache?
Postgres caches recent queries in memory called the shared buffer cache (shared_buffers
in postgresql.conf
). The pg_statio_user_tables
has as rows representing various stats on each of the (user) tables. The two columns of interest we’ll be looking at are:
pg_statio_user_tables.heap_blks_read
— Number of disk blocks read from a table (ie. missed cache)pg_statio_user_tables.heap_blks_hits
— Number of buffer hits from this table (ie. cache hit)
From these two columns, we can determine the cache hit ratio of a table by:
Or we can determine the average cache hit ratio across all tables:
Similarly, there is a pg_statio_user_indexes
which has as rows various stats of indexes:
pg_statio_user_indexes.indx_blks_read
— Number of disk block reads from this indexpg_statio_user_indexes.indx_blks_hit
— Number of buffer hits from this index.
We can determine the cache hit ratio for a given index or across all indexes:
A cache hit ratio of 99% is considered “well-engineered”. If it’s significantly less than that, we will need to increase the shared buffer cache (which ranges from 15% to 25% of the total system memory).
Database Connections
Maximum Number of Connections
The maximum number of connections Postgres can support is limited by the Postgres server size, in particular the memory. To see the maximum number of connections the database has, run:
SHOW max_connections; max_connections
-----------------
400
(1 row)
To see how many connections are being used at the moment, and what their statuses are or which applications are using them:
See these connections can give great insight how how to re-architect your application to use those connections more effectively. It can help you decide to use a connection pooling proxy like pgbouncer.
Resetting the Stats
After implementing changes, be sure to reset all the pg_stat*
tables using pg_stat_reset()
to check for changes in performance.