PgAnalyze
When running a web application on PostreSQL, it's nice to monitor the SQL queries continuously, so that it's possible to detect slow / not well optimized queries as soon as possible. This tool should give you interesting information about each of the suspicious queries.
This project was born somehow naturally - a few years ago I worked for a company that ran several middle-sized web applications (written in PHP) on PostgreSQL, and more and more often we had to solve performance problems. Obviously the cause was at the database layer, but it was very difficult to determine which SQL query (in what application, over what database) were causing the problems. Another question was whether the problems are caused by continuously increasing number of visitors, or if the SQL queries are just poorly formulated.
In the end we found out that PostgreSQL supports logging of slow queries, enabled it in postgresql.conf and after about a week we manually determined the most problematic SQL queries. After an analysis we have reformulated some of the queries, and we have modified parts of the applications - the performance problems almost entirely disappeared. The response rate noticeably improved, the variance of the response time decreased, etc.
We've decided that it would be nice to perform such monitoring regularly, and not just on the production server. We wrote a script that filtered out the slow queries from the log, and built quite complex HTML report. The first version of the pganalyze tool was born.
Present
When using the tool it became obvious that while PostgreSQL allows to log slow SQL queries into it's own log, there are serious disadvantages:
- format of log items may be different for each machine - it depends if syslog is used or not, and on postgresql.conf values too
- items in the log (and slow SQL queries too) may be divided into multiple rows and mixed with one another - that requires quite complicated parsing and identification of lines belonging to the same query. And it may not be possible at all (for example in case of improper postgresql.conf values)
- similar tool may be quite useful for other databases, but this is prevented by tight coupling with PostgreSQL log
- in PostgreSQL it's not possible to set some sophisticated logging parameters (for example probability that a query will be logged)
Because of these problems the native PostgreSQL logging is not used to monitor slow queries anymore - it has been moved to a PHP library encapsulating all DB operations. Queries are logged in a unified format (one query by line, etc.) so the parsing is much simpler, and it does not allow whether the queries were logged from PostgreSQL or any other database.




