Postgres Performance Analysis Scripts
Or How to Stop Worrying About Database Timeouts And Learn to Love These Scripts
Query Performance Analysis Scripts
The `pg_stat_activity` and `pg_stat_statements` Tables are Your Friends
pg_stat_activity tables provide valuable insights on query performance. The
pg_stat_statments table holds information on queries that ran in the past, while the
pg_stat_activity table holds information on queries running right this moment.
pg_stat_statements table has has one row per for each query that ran. It provides useful information in columns like:
pg_stat_statements.query— the query text
pg_stat_statements.calls— The number of times this query was run
pg_stat_statements.total_time— The total time (ms) spent on the query
pg_stat_activity table has one row per server process at this moment. It provides insightful info in columns like:
pg_stat_activity.query_startwhen query associated with this process started
pg_stat_activity.querythe text of this query
pg_stat_activity.statethe state of this query, possible values include
By querying these tables, we can answer questions like:
What are the slowest queries?
What are the slowest queries running right now sorted by running time
Which SELECT queries have read/touched the greatest number of rows
pg_stat_statement provides information about queries that already ran, to answer questions about queries running right this moment, we query the
pg_stat_activity table. It can answer questions like:
How many queries have been running for longer than a minute?
How long have the oldest queries been running?
How many queries are currently writing to the same table?
Kill Long Running Queries
pg_stat_activity table shows queries that are running right now, we can use its
pg_cancel_backend function to kill problematic queries.
For example, if
pg_stat_activity reveals long-running queries clogging up your database, a hot fix can just be to kill those queries. This will give a chance for the shorter queries to use the database. It’s better to have a few internal server errors caused by killing these queries, than to have the entire database unavailable.
Lock Analysis Scripts
Why Are There Locks?
Postgres queries acquire various types of locks on rows and tables. Depending on the type of lock, other queries may or not may be able to run concurrently depending on whether or not the locks conflict. The purpose of these locks is to resolve ambiguity regarding how the concurrent queries should behave. For example, if an
INSERT query is running concurrently with a
SELECT query, should the
SELECT query return the newly inserted row? Or what happens if two
UPDATE queries are running on the same row? These lock mechanism are part of a database design pattern called MVCC (Multi-version Concurrency Control) which is worth understanding.
But for the purpose of this article, just know that locks on rows/tables are mechanisms which prevent certain other queries from accessing that same row/tables until that lock is released.
Conflicting Locks Means Blocking Queries
Certain queries will acquire certain locks. If a query acquires a lock that conflicts with another query’s lock, then the second query cannot run until the first query has completed, or we say the first query is blocking the second query.
For example, a
SELECT query will acquire an
ACCESS SHARE lock on a table. An
UPDATE query will acquire a
ROW EXCLUSIVE lock on table. Because
ACCESS SHARE and
ROW EXCLUSIVE locks do not conflict, a
UPDATE query may both run on the same table concurrently (which makes sense).
CREATE INDEX without
CONCURRENTLY command acquires a
SHARE lock on a table, which conflicts with a
ROW EXCLUSIVE , which means we cannot
UPDATE query cannot run until the
CREATE INDEX has completed. However, an
ACCESS SHARE lock does not conflict with a
SHARE lock which means a
SELECT query on a table while a
CREATE INDEX is also running on that table.
The pg_locks Table
pg_locks table provides information on the locks through the following columns:
locktype— the type of object being locked
relation— the table being locked,
transactionid— the id of the transaction that is doing the locking
pid— the process id holding or waiting for this lock
granted— a boolean field where
truemeans the lock is being held and
falsemeans process is waiting to acquire the lock
From this table we can find out what queries are blocking other queries.
Index Usage Scripts
Because indices improve read query performance at the expense of write query performance, it’s a good idea to look for un-used (or even low use) indices and drop them.
Here’s a script to find unused or low-use indices:
Cache Performance Scripts
Is there Enough Shared Buffer Cache?
Postgres caches recent queries in memory called the shared buffer cache (
pg_statio_user_tables has as rows representing various stats on each of the (user) tables. The two columns of interest we’ll be looking at are:
pg_statio_user_tables.heap_blks_read— Number of disk blocks read from a table (ie. missed cache)
pg_statio_user_tables.heap_blks_hits— Number of buffer hits from this table (ie. cache hit)
From these two columns, we can determine the cache hit ratio of a table by:
Or we can determine the average cache hit ratio across all tables:
Similarly, there is a
pg_statio_user_indexes which has as rows various stats of indexes:
pg_statio_user_indexes.indx_blks_read— Number of disk block reads from this index
pg_statio_user_indexes.indx_blks_hit— Number of buffer hits from this index.
We can determine the cache hit ratio for a given index or across all indexes:
A cache hit ratio of 99% is considered “well-engineered”. If it’s significantly less than that, we will need to increase the shared buffer cache (which ranges from 15% to 25% of the total system memory).
Maximum Number of Connections
The maximum number of connections Postgres can support is limited by the Postgres server size, in particular the memory. To see the maximum number of connections the database has, run:
SHOW max_connections; max_connections
To see how many connections are being used at the moment, and what their statuses are or which applications are using them:
See these connections can give great insight how how to re-architect your application to use those connections more effectively. It can help you decide to use a connection pooling proxy like pgbouncer.
Resetting the Stats
After implementing changes, be sure to reset all the
pg_stat* tables using
pg_stat_reset() to check for changes in performance.