From the pgbench documentation:

-s scaling_factor: this should be used with -i (initialize) option. Number of tuples generated will be a multiple of the scaling factor. For example, -s 100 will imply 10M (10,000,000) tuples in the accounts table. Default is 1. NOTE: scaling factor should be at least as large as the largest number of clients you intend to test; else you'll mostly be measuring update contention.

That seems straightforward enough: set scale to be larger than the maximum number of clients you want to test against and you're done, right? If only it were so easy. Finding the right setting for the database scale is much more complicated than this; the size of the database and its indexes will determine whether you're testing memory performance, simple disk I/O, or disk seeking behavior. The best way to learn how this works is to watch how the pgbench results change as the size of the database increases, and that's what I'm going to lead you through here.
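The arithmetic here is simple: each unit of scale adds 100,000 rows to the accounts table. A quick sanity check of the 10M figure quoted above:

```shell
# Rows in the accounts table = scaling factor x 100,000
scale=100
echo "accounts rows at scale $scale: $((scale * 100000))"
```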
The basic pgbench tests don't care about effective_cache_size, because the sample database doesn't have any complicated indexes, so we can ignore that for now. The other initial change to postgresql.conf, which I'm not going to go into much detail on yet, is increasing checkpoint_segments; we'll get to tuning that properly later. Here are the changes to the postgresql.conf file to get started:
shared_buffers = 30000   # approx 250 MB
checkpoint_segments = 8
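shared_buffers here is a count of 8KB pages, so the size in the comment works out like this (it actually comes to roughly 234MB, which I'm rounding up to 250MB):

```shell
# 30000 buffers x 8KB per buffer, converted to megabytes
buffers=30000
echo "shared_buffers = $((buffers * 8 / 1024)) MB"
```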
After setting the parameters, start the cluster up and build the database using something like this:
pg_ctl start
createdb pgbench
cd contrib/pgbench
make
./pgbench -i -s 1 pgbench

That's a database with a scale of 1, which means 100,000 rows in the accounts table. We'll get our first test result like this:
./pgbench -t 2000 -c 8 -S pgbench

This runs the select-only test with 8 clients. 4 clients per CPU core is a reasonable value here to get enough clients to load up all your CPUs, which is how I got to 8 for the dual-core Athlon processor in the test server.
Here's a trio of results from running this three times. That's the minimum number of runs I'd suggest before considering the result significant; there is way too much variation in pgbench results (even if you run the test for a longer period) to take any single result as useful. We'll only be considering the TPS figure with the connection-establishment overhead taken out.
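Picking the middle value out of three runs is easy to script; here's a sketch using sort, with made-up TPS numbers standing in for real pgbench output:

```shell
# Middle (median) of three TPS results; the values here are hypothetical.
tps1=8500 tps2=10400 tps3=9200
middle=$(printf '%s\n' "$tps1" "$tps2" "$tps3" | sort -n | sed -n '2p')
echo "median TPS: $middle"
```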
pgbench# vacuum analyze;
pgbench# checkpoint;
pgbench# SELECT relname, reltuples, pg_size_pretty(relpages * 8192) AS size
         FROM pg_class ORDER BY relpages DESC;
              relname              | reltuples |    size
-----------------------------------+-----------+------------
 accounts                          |    100000 | 13 MB
 accounts_pkey                     |    100000 | 1768 kB

pgbench# SELECT datname, pg_size_pretty(pg_database_size(oid)) FROM pg_database;
  datname  | pg_size_pretty
-----------+----------------
 pgbench   | 18 MB

Now, the thing to do here is to step up the scale while running this whole series:
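The whole series is easy to drive with a loop. This sketch just prints the commands to run at each step; the scale values here are examples rather than the full set I tested:

```shell
# Dry run: print the initialize-then-test sequence for each scale.
for scale in 1 10 35 75 100; do
  echo "./pgbench -i -s $scale pgbench"
  for run in 1 2 3; do
    echo "./pgbench -t 2000 -c 8 -S pgbench  # run $run at scale $scale"
  done
done
```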
Scale | DB Size (MB) | accounts | accounts_pkey | tps1 | tps2 | tps3
You can likely re-use much of this table for your own tests, because unless you're using a very different version of PostgreSQL, the first four columns listing sizes will be the same each time if collected correctly; you'll just have different values in the three TPS columns. Here's what that looks like as a graph:
This was generated by taking the middle of the three TPS values for each scaling factor and plotting those against the scale. This is generally a good approach for useful pgbench results: run three times, use the middle TPS number. To generate your own graph like this, you can use my sample data as a template and feed that into the gnuplot script that drew this graph.
What does it mean? Running a select-heavy test like this gives results that fit into three categories:
CPU/Memory bound: Scale 1 - 35, 18 - 512MB database. When the data is small enough to fit in the shared_buffers cache, your disk performance doesn't matter one bit to the results once everything is in there. And it's hard to even get the data out of memory: it's cached there by the writes that build the database, and even if you restart the server, your operating system will likely have cached everything as well. You either need to reboot or do something drastic to clear out the database cache in order to actually test disk performance again, and even then the data will get read back very quickly and then you're back to testing memory operations again.
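On Linux, the drastic measure looks something like the sequence below; it's printed via a heredoc rather than executed, since the real thing needs root and restarts the server. Restarting PostgreSQL empties shared_buffers, and writing to drop_caches empties the OS page cache:

```shell
# Sketch: commands to clear both caches between disk tests (run as root).
cat <<'EOF'
pg_ctl restart
sync
echo 3 > /proc/sys/vm/drop_caches
EOF
```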
These results show we're memory bound right up to a scale of 35, and get 8500-10400 SELECTs per second in that range. The database size at that scale is 512MB. We have a 250MB shared_buffers cache, at least another 250MB sitting in the operating system cache, and the disk controller cache has its own 256MB of RAM. No wonder we can operate so fast on this data; the disk drives aren't even involved.
Disk I/O bound: Scale 40 - 75, 585 - 1095MB database. In this range, we can fit almost all the data we need in one of our bits of memory, but not quite all of it. There's some disk shuffling going on at times, but not so much that it completely pulls down the results.
Disk seek bound: Scale 100+, 1456MB or larger database. These results are from the area where only a fraction of the database can fit into memory. The results are now in a similar range to the maximum seeks per second that the bonnie++ tests benchmarked for this disk.
The exact breakpoints here shift a bit if we use a different client load, but the basic curve doesn't change much. Where your breakpoints end up very much depends on exactly how big all your caches are. There are a few important things to walk away with here:
Note how the curve spikes downward at 8GB the same way the smaller system dove at 1GB. Raw data for that example: