Benchmark results with SSD - read-only pgbench

In the last few posts I've discussed benchmark results with a traditional SATA drive; now it's time to discuss the results of the same tests with an SSD (Intel 320). This post covers the read-only pgbench test. As expected, the SSD performs better than the traditional drive in all three tests, but it's interesting to see how much the performance boost varies between the tests and how thoroughly the differences between the file systems are eliminated.

The average results for all file systems look like this

average read-only performance with SSD

The SSD results are available here, comparison of read-only results is here.

Comparing this with the performance of a traditional hard drive

average read-only performance with a traditional hard drive

it's obvious that the performance with the SSD grows significantly, from about 270 to about 7000 tps, i.e. roughly 26 times. The range of block sizes with the best performance shrank at the same time: while with a traditional drive the best performance is achieved for blocks between 1kB and 4kB, with the SSD the 4kB blocks perform significantly better than the other block sizes (the difference is at least 40%).
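As a quick sanity check of the arithmetic, here is a minimal Python sketch. It uses only the approximate figures quoted above, and it assumes that "at least 40%" means the 4kB result is at least 40% higher than the result for any other block size. The function names are mine, for illustration only.

```python
# Approximate figures quoted in the text (not exact measurements).
hdd_tps = 270.0    # best average tps with the traditional drive
ssd_tps = 7000.0   # best average tps with the SSD


def speedup(new_tps, old_tps):
    """Return how many times faster new_tps is compared to old_tps."""
    return new_tps / old_tps


def relative_gain(best_tps, other_tps):
    """Return the gain of best_tps over other_tps as a fraction (0.4 == 40%)."""
    return (best_tps - other_tps) / other_tps


print(f"SSD vs. traditional drive: {speedup(ssd_tps, hdd_tps):.1f}x")

# Assumption: if 4kB blocks are at least 40% faster, every other block
# size stays at or below 7000 / 1.4 tps.
print(f"upper bound for the other block sizes: {ssd_tps / 1.4:.0f} tps")
```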

The second interesting observation is that all file systems share this behaviour (the best performance, about 7000 tps, is achieved with 4kB blocks). For example, XFS and nilfs2 behave like this

read-only performance with XFS and SSD drive

read-only performance with nilfs2 and SSD drive

The differences are very small, although some are a bit more visible. For example, nilfs2 achieves the best performance of all the file systems with 1kB file system blocks and 4kB database blocks (but even this difference is less than 15% compared to the average).

My theory is that with the SSD the impact of I/O decreased significantly, and thus the efficiency of processing the blocks became more important. Coincidentally, 4kB is exactly the size of a memory page on the x86 platform. I wonder whether this "4kB optimality" would also hold for traditional drives with a lot of RAM (enough to hold the whole database) or a huge cache on the controller.
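The page size is easy to check on any given machine. A minimal sketch using Python's standard mmap module; the helper name is mine, and the 4096-byte comparison is only the value I'd expect on x86, so other platforms may report something else:

```python
import mmap


def page_size_bytes():
    """Return the memory page size reported by the operating system."""
    return mmap.PAGESIZE


size = page_size_bytes()
print(f"memory page size: {size} bytes ({size // 1024} kB)")
print("same as a 4kB database block:", size == 4096)
```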

The tps course is interesting too. At first sight it's obvious that the gradual run-up (which might take up to 5 minutes with a traditional drive) is completely gone.

tps course with XFS during a read-only test

Another interesting thing is that while with 4kB blocks the performance is quite stable, with smaller blocks it gradually decreases although it's almost as good as with 4kB blocks.

tps course with XFS during a read-only test with 2kB blocks

A latency chart is interesting too - during the warmup (first 5 minutes) it's significantly lower than during the actual test (second 5 minutes). The only thing that is different for those two separate pgbench runs is that warmup is performed with 10 clients while the benchmark uses 20 clients.

latency during a read-only test with ext4/ordered

I.e. although the total performance does not change, the latency logically increases with the number of clients. The difference between 10 and 20 clients can also be seen in some of the runs, for example in the fastest run (the already mentioned nilfs2 with 1kB file system blocks and 4kB database blocks)

tps during a read-only test with nilfs2
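The relation between the number of clients and the latency at a fixed throughput follows from Little's law. A minimal sketch, assuming the clients submit transactions back to back (no think time) and that the throughput stays at roughly 7000 tps for both client counts; the resulting numbers are estimates, not measured values:

```python
def expected_latency_ms(clients, tps):
    """Average latency per transaction (in ms) for a closed loop of clients.

    Little's law: clients = tps * latency, so latency = clients / tps.
    """
    return clients / tps * 1000.0


tps = 7000.0  # approximate peak throughput quoted in the post
for clients in (10, 20):
    print(f"{clients} clients: {expected_latency_ms(clients, tps):.2f} ms")
```

So if the throughput is capped by the drive, doubling the number of clients simply doubles the average latency.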

The database cache hit ratio is interesting too: it depends only on the database block size and ranges from about 60% to 70%, just like with a traditional drive.

database cache hit ratio with a SSD drive

The small differences are probably caused by a slightly modified random_page_cost value (decreased so that the planner prefers random I/O operations, which are cheap on an SSD).
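For reference, one common way to get a database cache hit ratio is from the blks_hit and blks_read counters in the pg_stat_database view. The sketch below is an assumption about how such a number can be computed, not necessarily how the charts in this post were produced, and the counter values are made up for illustration. Note that blks_hit counts only hits in PostgreSQL's own shared buffers, not in the operating system page cache.

```python
# Query against the standard pg_stat_database statistics view.
HIT_RATIO_QUERY = """
SELECT blks_hit, blks_read
  FROM pg_stat_database
 WHERE datname = current_database()
"""


def cache_hit_ratio(blks_hit, blks_read):
    """Return the fraction of block requests served from shared buffers."""
    total = blks_hit + blks_read
    if total == 0:
        return None  # no block requests recorded yet
    return blks_hit / total


# Hypothetical counter values, for illustration only.
ratio = cache_hit_ratio(blks_hit=650_000, blks_read=350_000)
print(f"cache hit ratio: {ratio:.1%}")
```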

The balanced behaviour of all file systems could be caused by 100% CPU utilization or some other bottleneck, so let's see some iostat values: disk utilization (%util), iowait and CPU utilization (i.e. 100% - idle).

CPU and drive utilization during a read-only test with a SSD drive

Obviously the CPU is almost 100% utilized (almost 92% with 10 clients, almost 100% with 20 clients), but most of that time (about 80%) is spent in iowait, i.e. waiting for I/O operations. The drive utilization is about 100% all the time.
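Because "CPU utilization" is defined here as 100% - idle, it includes iowait, which is why an almost fully "utilized" CPU can still be mostly waiting for the drive. A minimal sketch of that breakdown for a single iostat avg-cpu sample; the function name and the sample values are hypothetical, chosen only to resemble the 20-client case described above:

```python
def cpu_breakdown(user, system, iowait, idle, nice=0.0, steal=0.0):
    """Summarize one iostat avg-cpu sample (all values in percent)."""
    return {
        # "CPU utilization" as used in the post: everything except idle.
        "utilization": 100.0 - idle,
        # Time the CPU actually spent executing code.
        "busy": user + nice + system + steal,
        # Time the CPU sat idle while I/O requests were outstanding.
        "iowait": iowait,
    }


# Hypothetical sample, not a measured value.
sample = cpu_breakdown(user=12.0, system=7.0, iowait=80.0, idle=1.0)
for name, value in sample.items():
    print(f"{name:12s} {value:5.1f} %")
```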

I dare say that I/O remains the bottleneck even with the SSD. For comparison, let's see the same chart for a traditional drive

CPU and drive utilization during a read-only test with a traditional drive

Obviously the CPU load with the SSD is significantly higher (but that's somewhat expected due to the higher performance).

Conclusion

  • The differences between the various file systems are very small.
  • The dependency on file system block size is negligible.
  • The best performance is achieved with 4kB database blocks.
  • The CPU load is significantly higher than with a traditional hard drive, but I/O remains the main bottleneck (although with a slower CPU or a CPU with fewer cores this may not be true).

Comments

SSD endurance

Thanks for some great benchmarks and analysis. Very interesting to see how big a difference an SSD actually makes to the performance.

I am wondering if you could do some analysis of the write endurance of the Intel 320 when used with PostgreSQL and loads like this. It's specified to endure 15TB of 4kB random writes, but I wonder if PostgreSQL with its background writer actually writes in a pattern more acceptable for the database.

RE: SSD endurance

I haven't planned to test the write endurance and I'm not sure how to test it. But if you can point me to some instructions, I'll try it.
