OpenStats - benchmarking

When developing and deploying applications, performance testing is quite important - it allows you to allocate database tables to tablespaces, determine performance limits of the system, etc. For the OpenStats project I wrote a small Java multi-threaded benchmarking tool, reading SQL queries from a file and executing them in a database.

The mentioned utility is actually a single Java class OpenStatsBench, places in the misc/ directory - this class reads SQL queries from a file produced by the build-data-bench.php script, and executes them in a database. Let's build the testing data:

$ php build-data-bench.php days=7 \
                           sessions-per-day=10000 \
                           actions-per-session=10 \
                           parameters-per-action=2 \
                           visitors=10000 \
                           pages=100 \
                           types=5 \
                           languages=eng:ces \
                           start=2008-01-01 > bench.sql

This command build data from seven days, every day there will be about 10000 sessions, each session will have about 10 actions, and each action will have 2 parameters. Sessions will be assigned to 10000 unique visitors by random. Actions will be linked to 100 pages (ID from 1 to 100), and there will be 5 types of actions (with ID from 1 to 5). There will be two languages ("eng" and "ces"), and the first day will be 2008-01-01.

The data produced by the script will be redirected to file bench.sql, and this file will be subsequently used as input to the benchmarking tool written in Java - OpenStatsBench class. This class contains definition of several threads, one of them (called "feeder") reads SQL queries from the file and inserting them into a queue, and several threads (called "executor") reading them from the queue and executing them over a database. The executor threads collect statistics about inserts executed (to what table, how long did it take). At the end each executor prints a brief statistics.

$ javac OpenStatsBench.java
$ java -cp postgresql-jdbc.jar:. OpenStats.java --file bench.sql --runners 3 \
       --host localhost --port 5432 --dbname ostest --user osuser --pass ospass

The first command compiles the java utility, the other one runs it - you have to issue the file to read the data from (it was built by the build-data-bench.php script), number of threads running the SQL queries (3 in our case), and database connection details (host, port, database name, username and password).

Example of OpenStatsBench output

The following text contains output of this utility - although not bad, don't forget the utility was executed on my desktop (AMD Athlon 64 X2 4400+, 2GB DDR2 RAM, ordinary SATA 7.2k drives without RAID), so the results on a regular server with quality SCSI / SAS controller, fast drives, large amount of RAM and a powerful CPU should be much better. And of course, the statistics are influenced by various PostgreSQL settings.

Executing the OpenStatsBench using the command (3 executors etc.), we'll get an output similar to this one (each executor produces a separate statistics):

================== Executor 0 ==================
Sessions: 3330 in 7328 ms (454 rows / sec)
Actions: 33300 in 74929 ms (444 rows / sec)
Parameters: 66600 in 141071 ms (472 rows / sec)
-------------------------------------------------
Total: 103230 in 223328 ms (462 rows / sec)

================== Executor 1 ==================
Sessions: 3349 in 7399 ms (452 rows / sec)
Actions: 33490 in 74561 ms (449 rows / sec)
Parameters: 66980 in 141445 ms (473 rows / sec)
-------------------------------------------------
Total: 103819 in 223405 ms (464 rows / sec)

================== Executor 2 ==================
Sessions: 3321 in 8045 ms (412 rows / sec)
Actions: 33210 in 74474 ms (445 rows / sec)
Parameters: 66420 in 140859 ms (471 rows / sec)
-------------------------------------------------
Total: 102951 in 223378 ms (460 rows / sec)

As you can see from the output, inserts into all three tables are about equally fast (in all cases about 1300 rows per second).

If you are interested in system load during the test (to determine performance bottlenecks - CPU / drives, etc.) you may use utility "dstat" - a typical example of the output is (just a few lines):

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
 53  21   9  10   1   5|8192B   19M|   0     0 |   0     0 |3882    12k
 53  21  11   8   2   7|   0    19M|   0     0 |   0     0 |4036    12k
 47  20   9  17   1   6|   0    20M|   0     0 |   0     0 |3670    11k
 57  21  11   5   0   5|   0    18M|   0     0 |   0     0 |3642    12k
 49  23   9   9   2   7|   0    19M|   0     0 |   0     0 |3983    12k
 50  22  10   9   2   7|   0    19M|   0     0 |   0     0 |3933    11k
 52  20  10   8   3   6|8192B   19M|   0     0 |   0     0 |4078    12k
 ....

The "top" command may be used to get information about each process:

top - 17:58:02 up  1:33,  1 user,  load average: 2.20, 0.75, 0.44
Tasks: 116 total,   4 running, 112 sleeping,   0 stopped,   0 zombie
Cpu(s): 53.3%us, 21.3%sy,  0.0%ni,  8.5%id,  9.8%wa,  1.8%hi,  5.3%si,  0.0%st
Mem:   2059928k total,  1406112k used,   653816k free,    55912k buffers
Swap:  4144760k total,      224k used,  4144536k free,   976444k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9122 postgres  20   0 36196 5164 3540 R   44  0.3   0:31.22 postmaster
 9120 postgres  20   0 36168 5096 3484 S   44  0.2   0:31.26 postmaster
 9121 postgres  20   0 36176 5176 3548 R   41  0.3   0:31.84 postmaster
 9101 vampire   20   0  829m  35m 7972 S   22  1.7   0:20.42 java

You may see the 3 "postmaster" processes servicing the "executors" are using about the same CPU (41% - 44%), although the overhead of the Java process is not negligible (22%). That's why I recommend running this tool on a different machine than the database.

Comments

There are no comments for this article (or are awaiting acceptance).

New comment

All the comments have to be accepted, so there may be some delay between submitting and accepting (or rejecting) the comment. If you enter the e-mail address, you will be informed about acceptance or rejection.

Subject or body may not contain HTML tags - they will be automatically removed. Paragraphs may be separated using a newline (ENTER).

(optional)