OpenStats - benchmarking
When developing and deploying applications, performance testing is quite important - it allows you to allocate database tables to tablespaces, determine performance limits of the system, etc. For the OpenStats project I wrote a small Java multi-threaded benchmarking tool, reading SQL queries from a file and executing them in a database.
The mentioned utility is actually a single Java class OpenStatsBench, places in the misc/ directory - this class reads SQL queries from a file produced by the build-data-bench.php script, and executes them in a database. Let's build the testing data:
$ php build-data-bench.php days=7 \
sessions-per-day=10000 \
actions-per-session=10 \
parameters-per-action=2 \
visitors=10000 \
pages=100 \
types=5 \
languages=eng:ces \
start=2008-01-01 > bench.sql
This command build data from seven days, every day there will be about 10000 sessions, each session will have about 10 actions, and each action will have 2 parameters. Sessions will be assigned to 10000 unique visitors by random. Actions will be linked to 100 pages (ID from 1 to 100), and there will be 5 types of actions (with ID from 1 to 5). There will be two languages ("eng" and "ces"), and the first day will be 2008-01-01.
The data produced by the script will be redirected to file bench.sql, and this file will be subsequently used as input to the benchmarking tool written in Java - OpenStatsBench class. This class contains definition of several threads, one of them (called "feeder") reads SQL queries from the file and inserting them into a queue, and several threads (called "executor") reading them from the queue and executing them over a database. The executor threads collect statistics about inserts executed (to what table, how long did it take). At the end each executor prints a brief statistics.
$ javac OpenStatsBench.java
$ java -cp postgresql-jdbc.jar:. OpenStats.java --file bench.sql --runners 3 \
--host localhost --port 5432 --dbname ostest --user osuser --pass ospass
The first command compiles the java utility, the other one runs it - you have to issue the file to read the data from (it was built by the build-data-bench.php script), number of threads running the SQL queries (3 in our case), and database connection details (host, port, database name, username and password).
Example of OpenStatsBench output
The following text contains output of this utility - although not bad, don't forget the utility was executed on my desktop (AMD Athlon 64 X2 4400+, 2GB DDR2 RAM, ordinary SATA 7.2k drives without RAID), so the results on a regular server with quality SCSI / SAS controller, fast drives, large amount of RAM and a powerful CPU should be much better. And of course, the statistics are influenced by various PostgreSQL settings.
Executing the OpenStatsBench using the command (3 executors etc.), we'll get an output similar to this one (each executor produces a separate statistics):
================== Executor 0 ================== Sessions: 3330 in 7328 ms (454 rows / sec) Actions: 33300 in 74929 ms (444 rows / sec) Parameters: 66600 in 141071 ms (472 rows / sec) ------------------------------------------------- Total: 103230 in 223328 ms (462 rows / sec) ================== Executor 1 ================== Sessions: 3349 in 7399 ms (452 rows / sec) Actions: 33490 in 74561 ms (449 rows / sec) Parameters: 66980 in 141445 ms (473 rows / sec) ------------------------------------------------- Total: 103819 in 223405 ms (464 rows / sec) ================== Executor 2 ================== Sessions: 3321 in 8045 ms (412 rows / sec) Actions: 33210 in 74474 ms (445 rows / sec) Parameters: 66420 in 140859 ms (471 rows / sec) ------------------------------------------------- Total: 102951 in 223378 ms (460 rows / sec)
As you can see from the output, inserts into all three tables are about equally fast (in all cases about 1300 rows per second).
If you are interested in system load during the test (to determine performance bottlenecks - CPU / drives, etc.) you may use utility "dstat" - a typical example of the output is (just a few lines):
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- usr sys idl wai hiq siq| read writ| recv send| in out | int csw 53 21 9 10 1 5|8192B 19M| 0 0 | 0 0 |3882 12k 53 21 11 8 2 7| 0 19M| 0 0 | 0 0 |4036 12k 47 20 9 17 1 6| 0 20M| 0 0 | 0 0 |3670 11k 57 21 11 5 0 5| 0 18M| 0 0 | 0 0 |3642 12k 49 23 9 9 2 7| 0 19M| 0 0 | 0 0 |3983 12k 50 22 10 9 2 7| 0 19M| 0 0 | 0 0 |3933 11k 52 20 10 8 3 6|8192B 19M| 0 0 | 0 0 |4078 12k ....
The "top" command may be used to get information about each process:
top - 17:58:02 up 1:33, 1 user, load average: 2.20, 0.75, 0.44 Tasks: 116 total, 4 running, 112 sleeping, 0 stopped, 0 zombie Cpu(s): 53.3%us, 21.3%sy, 0.0%ni, 8.5%id, 9.8%wa, 1.8%hi, 5.3%si, 0.0%st Mem: 2059928k total, 1406112k used, 653816k free, 55912k buffers Swap: 4144760k total, 224k used, 4144536k free, 976444k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 9122 postgres 20 0 36196 5164 3540 R 44 0.3 0:31.22 postmaster 9120 postgres 20 0 36168 5096 3484 S 44 0.2 0:31.26 postmaster 9121 postgres 20 0 36176 5176 3548 R 41 0.3 0:31.84 postmaster 9101 vampire 20 0 829m 35m 7972 S 22 1.7 0:20.42 java
You may see the 3 "postmaster" processes servicing the "executors" are using about the same CPU (41% - 44%), although the overhead of the Java process is not negligible (22%). That's why I recommend running this tool on a different machine than the database.




