DB tuning using pgstatspack
If you have ever engaged in performance tuning in one of the newer Oracle versions, maybe you have met the legendary statspack (see Oracle Statspack Survival Guide), or its successor Automatic Workload Repository (AWR), whose main principle lies in creating "snapshots" of important internal values (number of queries, transactions, disk reads, cache reads, etc) and consequent analysis of differences between snapshots. It is not widely known that there is a similar tool for PostgreSQL called pgstatspack, which references to Statspack even by its name - it is based on the very same principle (analysis of snapshot differences), and although it is more simple, it may provide invaluable services when tuning performance.
As already noted, the principle of statspack lies in creating snapshots of internal information available in V$ views, and creating a snapshot means storing the data from a view into a designated tables. When performing analysis of the collected data, two snapshots are loaded and their difference is computed, which is then processed and displayed to the user.
Statspack vs. pgstatspack
Pgstatspack is based on the very same principle as statspack, i.e. it stores "snapshots" of data from interesting system views (pg_stat_* and pg_statio_*) into designated tables. Obviously there are many differences - let's see two of the most important. The first difference relates to the level of information collected - while the original statspack collects data about the whole instance (as well as individual objects), pgstatspack collects data primarily at the level of individual objects (tables, indexes, sequences, ...). The second difference relates to the range of collected information, and (if we dismiss different levels of data collection) we may say that Oracle collects the following types of information (along with the information if this is available in PostgreSQL):
- rollback segment - thanks to the MVCC architecture of PostgreSQL, there is nothing like a rollback segment, so this statistics makes no sense
- row cache - basic information about blocks read from cache are available at the database level (pg_stat_database) as well as at the object level (pg_stat_all_tables)
- events (system, background, session) - events are not available in PostgreSQL
- system statistics - basis system statistics are available, partly at the database level (pg_stat_database), partly at the bgwriter level (pg_stat_bgwriter), but the bgwriter statistics are not yet available in pgstatspack
- wait statistics - statistics of "wait" are not yet available in PostgreSQL (although dtrace or a similar technology may provide this)
- lock statistics - not yet collected in PostgreSQL
- latch statistics - latches in Oracle are used to synchronize access to shared resources (e.g. query plan cache in a shared pool etc.), but its statistics is not yet available in PostgreSQL
- SQL statements, query plans - not available in PostgreSQL
- statistics at the segment level, i.e. DB objects (logical/physical reads, row lock, waits, buffer busy waits) - yes, this is available in PostgreSQL (but not all of the statistics are supported by PostgreSQL - see above)
That is all regarding differences between statspack and pgstatspack. And now lets see how to use pgstatspack in practice.
A little bit about snapshots
I'm not going to describe pgstatspack here - it is well described in README file, and basically it's nothing more than creating several tables and a procedure that stores snapshot data into them. But first let's talk about a little bit of theory about snapshots.
Optimal interval between snapshots
As pgstatspack is based on snapshot differences, you need two snapshots - if you have a single snapshot you can't compute a difference and that's all. The question is how long should be the interval between snapshots. Optimal value strongly depends on the database activity, but the experience shows that it makes absolutely no sense to use intervals spanning days - the best length span minutes (e.g. 5 to 15 minutes), and maybe hours in some rare cases (but that is the upper bound).
The reason is quite simple - pgstatspack may be used especially when solving (short-term or long-term) performance problems, indicated by a step change in collected data. An example may be a case when the database suddenly stops to use an index on a large table, which causes a step increase of number of sequential scans of the table (and number of sequentially read blocks), and conversely step decrease of number of index scans, number of blocks read from an index, etc.
In case of short-term performance problems, using a too long interval between snapshots may cause the step change become more difficult to identify, which contradicts the main purpose of snapshots. On the other hand in case of long-term problems (e.g. during the whole interval), it actually makes no difference if you collect the data for 5 minutes or 5 hours - you have to normalize the values anyway (to convert all the values to "per second" values - see below). So in case of long-term problems it does not matter how long the interval is.
It is obvious that too long intervals won't give you any advantage - in case of long-term problems the interval length generally does not matter, and in case of short-term problems it limits the usability of collected data when identifying the causes.
Collecting Snapshots at the time when problems occur is not sufficient
Let's say you are trying to solve an urgent performance problem - 15 minutes ago everything was fine, but suddenly the database has gone bottoms up, which badly influences the users (e.g. web applications accessing the database). You have to identify the problem and fix it, but you have absolutely no clue what went wrong and where to start, so you decide to use pgstatspack believing it will point to a table causing the problems etc. So you take two snapshots with 5 minutes between them, and you compute the difference between them - but how do you decide which values are fine and which values indicate a step change? Sure, you may guess based on your experience, but that is not reliable.
That means that when solving performance problems, the snapshots taken when everything was fine are just as valuable as snapshots made when there was a problem. The best way how to do this is to create snapshots regularly, e.g. using a cron.
Pgstatspack usage in practice
And now for some practical example of pgstatspack usage - I have to warn you in advance that this is an artificial example, serving just for purpose of demonstration. We'll use a very simple table
CREATE TABLE pgstatspack_test (
id INTEGER,
value TEXT
);
CREATE UNIQUE INDEX pgstatspack_test_idx ON pgstatspack_test(id);
filled with about 100.000 rows (about 35MB for a table and 2MB for the index):
INSERT INTO pgstatspack_test SELECT i, repeat(md5(i::text), 10)
FROM generate_series(1,100000) s(i);
ANALYZE pgstatspack_test;
Say we want to load 1000 random rows using the 'id' value using the following procedure
CREATE OR REPLACE FUNCTION pgstatspack_test_proc() RETURNS void AS $$
DECLARE
v_text TEXT;
BEGIN
FOR i IN 1..1000 LOOP
SELECT value INTO v_text FROM pgstatspack_test
WHERE id = (SELECT (99999*random())::int + 1);
END LOOP;
END;
$$ language plpgsql;
Lets execute it (and create a snapshots before and after the execution):
SELECT pgstatspack_snap('index-before');
SELECT pgstatspack_test_proc();
SELECT pgstatspack_snap('index-after');
and then generate the report using pgstatspack_report.sh shell script - an example I've created on my machine may be downloaded here: pgstatreport_index_scan.txt.
Now lets do the same with index scan (and bitmap index scan) disabled.
SET enable_indexscan = off;
SET enable_bitmapscan = off;
-- at this point you'll have to recreate the testing procedure,
-- otherwise the original query (using index) will be used
SELECT pgstatspack_snap('index-before');
SELECT pgstatspack_test_proc();
SELECT pgstatspack_snap('index-after');
And then create a report again - my report may be downloaded here: pgstatreport_seq_scan.txt.
So we have two reports, and now we have to analyze them - we want to "identify" why the second execution took about 10x longer (36 seconds compared to 4 seconds of the first execution).
Database statistics
The first interesting values are available in the "database statistics" section - if you have multiple databases in the PostgreSQL cluster, so this section may be interesting if you need to identify which of the databases generate most of the load (I/O operations etc.).
We already know that our problems are caused by the "pgstatspack" database, so I will list only the particular row (and I will do the same in the following sections).
In case of index usage you will get the following numbers
| database | tps | hitrate | lio ps | pio ps | rollbk ps |
| pgstatspack | 0.47 | 89 | 1057.19 | 106.24 | 0.00 |
while without the index (e.g. with the sequential scan) you will get this
| database | tps | hitrate | lio ps | pio ps | rollbk ps |
| pgstatspack | 0.22 | 65 | 62244.63 | 21195.78 | 0.00 |
Even from this first report it is obvious that there is a huge increase in the number of I/O operations, logical (60x) as well as physical (20x) ones. This may not mean an error - it may be caused by a temporary increase of activity (more users mean more loaded data).
Let's see the other sections of the report as they will bring more light - in our case the interesting sections are "Tables ordered by percentage of tuples scanned" and "Indexes ordered by scans."
Tables ordered by percentage of tuples scanned
Lets see the statistics of tables - with indexes it looks like this
| table | rows read % | tab hitrate | idx hitrate | tab read | tab hit | idx read | idx hit |
| pgstatspack_test | 20 | 55 | 99 | 447 | 553 | 0 | 2001 |
and without indexes it looks like this
| table | rows read % | tab hitrate | idx hitrate | tab read | tab hit | idx read | idx hit |
| pgstatspack_test | 99 | 65 | 0 | 769837 | 1489737 | 0 | 0 |
As you can see, without the indexes there's much more loaded blocks from the table and conversely much less index hits, which is an umistakable sign of sequential scan used instead of an index scan, just like the increase of percentage of rows read to 99%.
Indexes ordered by scans
Usage of sequential scan instead of index scan may be confirmed in the section "Indexes ordered by scans" - while with indexes it looks like this
| index | scans | tup read | tup fetch | idx blks read | idx blks hit |
| pgstatspack_test_idx | 1000 | 1000 | 1000 | 0 | 2001 |
without the index scans the index (pgstatspack_test_idx) is not listed here at all (as the index is not used because we disabled it).
Conclusion
I hope I have successfully demonstrated how useful may the pgstatspack be. Obviously it is not a tool for users who don't know anything about the databases - if you are working with the databases and are familiar with the notions of sequential scans, index scans, cache, etc. then pgstatspack may be a tool for you.





Hi
really interesting stuff for postgresSQL
thank you !
patrick boulay
www.zion-dba.com
Thanks, nice to see it's still useful for someone else.