Warning: all of these tests assume you have your server basically to yourself in order to get accurate results. If other people are using the server at the same time, it will badly skew the results here, and you will likely slow them down considerably. Also, make sure you have enough disk space available where you run the tests; filling up your disk with test data and making other programs crash because of it is a bad scene I don't recommend.
2. Time how long it takes to write that many blocks and flush the data to disk like this:
# time sh -c "dd if=/dev/zero of=bigfile bs=8k count=_blocks_ && sync"
3. Time reading that data off disk again:
# time dd if=bigfile of=/dev/null bs=8k
For our test system with 1GB of RAM this is easy: 250,000 * 1 = 250,000 blocks, which at 8k each is about 2GB of data, twice the RAM. Here are the results:
-bash-3.00$ time sh -c "dd if=/dev/zero of=bigfile bs=8k count=250000 && sync"
250000+0 records in
250000+0 records out

real    0m47.961s
user    0m0.086s
sys     0m6.018s
That's 41.7MB/sec writing with 12.7% CPU utilization. Since there are 2 processors in this system, it's better to think of that as 25.4% of one processor. If we can write as fast as the disks can keep up and are only using 1/4 of a processor to do it, that suggests the CPUs in this system should be able to keep up with an I/O bound load.
-bash-3.00$ time dd if=bigfile of=/dev/null bs=8k
250000+0 records in
250000+0 records out

real    0m35.450s
user    0m0.073s
sys     0m2.128s
That's 56.4MB/sec reading with 6.2% CPU utilization.
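The arithmetic behind those figures can be sketched as a couple of small shell functions. This is only an illustration: the names blocks_for_ram and dd_stats are made up here, and the sketch assumes the rounding that reproduces the numbers quoted above (each 8k block counted as 8KB, 1MB as 1000KB, and CPU utilization taken as user plus sys time divided by real time).

```shell
#!/bin/sh
# Hypothetical helpers, not part of dd or any standard tool.

# blocks_for_ram GB: number of 8k blocks to write, at 250,000 blocks per GB of RAM
blocks_for_ram() {
    echo $(( $1 * 250000 ))
}

# dd_stats BLOCKS REAL USER SYS: times are in seconds, as reported by "time"
# Prints throughput in MB/s and CPU utilization as a percentage of elapsed time
dd_stats() {
    awk -v b="$1" -v r="$2" -v u="$3" -v s="$4" 'BEGIN {
        printf "%.1f MB/s, %.1f%% CPU\n", b * 8 / 1000 / r, (u + s) / r * 100
    }'
}

blocks_for_ram 1                      # block count for the 1GB test system
dd_stats 250000 47.961 0.086 6.018    # the write run
dd_stats 250000 35.450 0.073 2.128    # the read run
```

On a machine with more than one processor, multiply the CPU percentage by the processor count to express it as a share of a single processor, as done for the write result.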
These results are pretty good for a single 7200RPM disk; 56MB/s read and 42MB/s write is certainly close to the maximum I/O you can expect one cheap drive to accomplish. Your results should scale based on disk technology and number of disks. For example, were this a RAID-0 volume with 4 disks in it, I'd want to see something closer to 200MB/s as a read result here.
What's nice about dd test results is that it's very hard to argue with them. If you tell your vendor "I only get 10MB/s of writes when I run a simple dd test with 8k blocks", there is not a lot of room for them to weasel out of that by saying your test methodology is unsound.
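If you run this test often, the steps can be wrapped in a script. What follows is a rough sketch rather than a polished tool: the function names enough_space and dd_test are invented for this example, the free-space check is a simple guard based on the disk space warning at the top, the final cleanup of bigfile is an addition of mine, and it assumes bash (for the time keyword) plus a df that supports the POSIX -P and -k options.

```shell
#!/bin/bash
# Sketch of the dd test procedure; function names are hypothetical.

# enough_space DIR BLOCKS: succeed only if DIR has room for BLOCKS 8k blocks
enough_space() {
    local need_kb=$(( $2 * 8 ))
    # df -P prints one line per filesystem; column 4 is the available space in KB
    local free_kb
    free_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -gt "$need_kb" ]
}

# dd_test DIR RAM_GB: write, then read back, 250,000 blocks per GB of RAM
dd_test() {
    local blocks=$(( $2 * 250000 ))
    if ! enough_space "$1" "$blocks"; then
        echo "not enough free space in $1" >&2
        return 1
    fi
    (
        # Run in a subshell so the caller's working directory is unchanged
        cd "$1" || exit 1
        time sh -c "dd if=/dev/zero of=bigfile bs=8k count=$blocks && sync"
        time dd if=bigfile of=/dev/null bs=8k
        rm -f bigfile    # cleanup step added for this sketch
    )
}
```

For the 1GB test system used here, running dd_test . 1 from the directory on the disk being tested would repeat the two timed commands shown earlier.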
# ./bonnie++
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
thud             2G 33324  60 46554  13 24155   5 47561  81 55243   5 182.4   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
thud,2G,33324,60,46554,13,24155,5,47561,81,55243,5,182.4,0,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++
Note that the block results here (46MB/s write with 13% CPU, 55MB/s read with 5% CPU) are
essentially the same as the dd results above. For database use, those are the main
things that matter; a database doesn't write on a character basis. We can create a fancy HTML version
of the output by pasting the CSV line back into another bonnie++ utility:
# chmod +x ./bon_csv2html
# echo thud,2G,33324,60,46554,13,24155,5,47561,81,55243,5,182.4,0,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++ | ./bon_csv2html > disk.htm
That produces the following chart:
| Machine | Size | Test | Rate | % CPU |
| --- | --- | --- | --- | --- |
| thud | 2G | Sequential Output, Per Char | 33324 K/sec | 60 |
| thud | 2G | Sequential Output, Block | 46554 K/sec | 13 |
| thud | 2G | Sequential Output, Rewrite | 24155 K/sec | 5 |
| thud | 2G | Sequential Input, Per Char | 47561 K/sec | 81 |
| thud | 2G | Sequential Input, Block | 55243 K/sec | 5 |
| thud | 2G | Random Seeks | 182.4 /sec | 0 |
| thud | 2G | Sequential Create (16 files), Create | +++++ | +++ |
| thud | 2G | Sequential Create (16 files), Read | +++++ | +++ |
| thud | 2G | Sequential Create (16 files), Delete | +++++ | +++ |
| thud | 2G | Random Create (16 files), Create | +++++ | +++ |
| thud | 2G | Random Create (16 files), Read | +++++ | +++ |
| thud | 2G | Random Create (16 files), Delete | +++++ | +++ |
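When all you want are the block and seek figures quoted above, they can also be pulled straight out of the CSV line with awk. This is a minimal sketch: bonnie_summary is a made-up name, and the field positions are an assumption based on the order of the columns in the bonnie++ 1.03 output shown here (fields 5 and 6 for block write, 11 and 12 for block read, 13 for seeks).

```shell
#!/bin/sh
# bonnie_summary: hypothetical helper that reads bonnie++ 1.03 CSV lines on stdin
# and prints the block write, block read, and seek results for each one
bonnie_summary() {
    awk -F, '{
        printf "%s: block write %.1f MB/s (%s%% CPU), block read %.1f MB/s (%s%% CPU), %s seeks/s\n",
            $1, $5 / 1000, $6, $11 / 1000, $12, $13
    }'
}

echo "thud,2G,33324,60,46554,13,24155,5,47561,81,55243,5,182.4,0,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++" | bonnie_summary
```

The K/sec values are divided by 1000 and rounded to one decimal place, so the numbers come out slightly different from the truncated ones used in the text.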
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
slowserver       2G 20583  49 20830   9 10314   3 34440  72 57350   7 447.0   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3021  97 +++++ +++ +++++ +++  2938  93 +++++ +++  7080 100
slowserver,2G,20583,49,20830,9,10314,3,34440,72,57350,7,447.0,0,16,3021,97,+++++,+++,+++++,+++,2938,93,+++++,+++,7080,100
| Machine | Size | Test | Rate | % CPU |
| --- | --- | --- | --- | --- |
| slowserver | 2G | Sequential Output, Per Char | 20583 K/sec | 49 |
| slowserver | 2G | Sequential Output, Block | 20830 K/sec | 9 |
| slowserver | 2G | Sequential Output, Rewrite | 10314 K/sec | 3 |
| slowserver | 2G | Sequential Input, Per Char | 34440 K/sec | 72 |
| slowserver | 2G | Sequential Input, Block | 57350 K/sec | 7 |
| slowserver | 2G | Random Seeks | 447.0 /sec | 0 |
| slowserver | 2G | Sequential Create (16 files), Create | 3021 /sec | 97 |
| slowserver | 2G | Sequential Create (16 files), Read | +++++ | +++ |
| slowserver | 2G | Sequential Create (16 files), Delete | +++++ | +++ |
| slowserver | 2G | Random Create (16 files), Create | 2938 /sec | 93 |
| slowserver | 2G | Random Create (16 files), Read | +++++ | +++ |
| slowserver | 2G | Random Create (16 files), Delete | 7080 /sec | 100 |
On the plus side, these disks do get a much better seek rating than my test server. This is a combination of them having a faster rotation speed (10K vs. 7200RPM), the fact that these SCSI disks are much smaller (seeks on a 36GB drive can execute a whole lot faster than they do on a 160GB one), and that seek reads may be getting split between the two drives in the RAID-1 volume.
You'll often find people recommending the LSI controllers for Linux SCSI RAID, and for good reason: they are very stable and reliable. They're just not fast for this application. This sort of thing is exactly why you need to run your own performance tests on your hardware. There are so many links in the disk performance chain, any one of which can completely destroy throughput, that testing is the only way to make sure you're getting the end-to-end performance you expect. This isn't just limited to hardware. There are plenty of ways to screw up things like filesystem configuration, where everything from a bad journaling setup to poorly performing LVM software can just trash results from otherwise solid equipment. Test yourself, make sure you understand the results, and then when you run into a performance issue you'll be in a much better position to understand which level is introducing it.