Here are some file IO performance numbers from a DELL MD1120 SAS storage array. Last year I ran the same test against an HP P800 storage array and the numbers were impressive, but this high-end array produced a few surprises. Before getting into the actual details, let's look at the test and configuration specifics.
System Configuration:
- DELL R710 with CentOS 5.4
- NOOP IO scheduler (see the sysfs sketch after this list)
- MD1120 with 22 10K SAS disks
- 20-disk RAID-10 (hardware)
- 2 hot spares
- Disk cache disabled
- PERC 6/E RAID controller with BBU
- Connected to the DELL MD1120 via SAS
- Write-back cache enabled
- Read cache disabled
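For reference, a minimal sketch of how the NOOP scheduler can be selected at runtime via sysfs (the device name sdc matches the hdparm test below):

```
# Show the available schedulers; the active one appears in brackets
cat /sys/block/sdc/queue/scheduler

# Switch to noop at runtime; to persist across reboots, add
# elevator=noop to the kernel boot line instead
echo noop > /sys/block/sdc/queue/scheduler
```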
Test Configuration:
- Sysbench fileio test with variable modes and threads
- 64 files with 50G total size
- All tests ran in unbuffered mode (O_DIRECT), as most of the workload is InnoDB-based (a sketch of the sysbench invocation follows this list)
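For reproducibility, a rough sketch of the sysbench invocation matching the configuration above, assuming the sysbench 0.4.x command-line syntax; the mode and thread count shown are placeholders that were varied per run:

```
# Create the 64 test files (50G total)
sysbench --test=fileio --file-num=64 --file-total-size=50G prepare

# Run one mode/thread combination; --file-test-mode was varied across
# rndrd/rndwr/rndrw/seqrd/seqwr and --num-threads across the thread counts
sysbench --test=fileio --file-num=64 --file-total-size=50G \
         --file-test-mode=rndrw --file-extra-flags=direct \
         --num-threads=8 --max-time=300 --max-requests=0 run

# Clean up the test files
sysbench --test=fileio --file-num=64 --file-total-size=50G cleanup
```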
Test Results:
Number of threads vs. number of requests/sec. Each mode was run for 5 iterations and the average is reported.
Random IO: (chart: random IO requests/sec vs. thread count)
Sequential IO: (chart: sequential IO requests/sec vs. thread count)
HDPARM Test:
```
[test ~]# for i in `seq 1 3`; do hdparm --direct -tT /dev/sdc1; done | grep Timing
 Timing O_DIRECT cached reads:   2068 MB in  2.00 seconds = 1033.21 MB/sec
 Timing O_DIRECT disk reads:     2146 MB in  3.00 seconds =  715.32 MB/sec
 Timing O_DIRECT cached reads:   2020 MB in  2.00 seconds = 1010.26 MB/sec
 Timing O_DIRECT disk reads:     2162 MB in  3.00 seconds =  720.62 MB/sec
 Timing O_DIRECT cached reads:   2052 MB in  2.00 seconds = 1025.90 MB/sec
 Timing O_DIRECT disk reads:     2128 MB in  3.00 seconds =  709.17 MB/sec

[test ~]# for i in `seq 1 3`; do hdparm -tT /dev/sdc1; done | grep Timing
 Timing cached reads:           18920 MB in  2.00 seconds = 9475.34 MB/sec
 Timing buffered disk reads:     3442 MB in  3.02 seconds = 1141.44 MB/sec
 Timing cached reads:           19332 MB in  2.00 seconds = 9681.56 MB/sec
 Timing buffered disk reads:     3478 MB in  3.00 seconds = 1159.24 MB/sec
 Timing cached reads:           18012 MB in  2.00 seconds = 9019.50 MB/sec
 Timing buffered disk reads:     3492 MB in  3.02 seconds = 1155.53 MB/sec
```
Analysis:
- Overall, the numbers are not bad when it comes to writes, but there are a few surprises when it comes to reads. Compared with the HP P800 storage array, the numbers dropped by about 20%.
- Random IO:
- Random write requests range from 3200 to 5000 per second, thanks to write-back mode (512M cache).
- Writes scale linearly with the number of threads, a good sign that the controller is managing its cache efficiently.
- The mixed random read/write mode (rndrw) also scales linearly with the thread load, which suggests the IO distribution and the cache handling for reads are efficient, since data must be flushed from the controller cache to disk before a read can be satisfied.
- Sequential IO:
- Writes seem to scale well even in sequential mode, without much overhead.
- When it comes to reads, the big surprise is the drop from 5626 requests/sec with one thread to 615 with two threads, which is really odd. Even in the worst case it should be around 2000-3000 requests/sec; I am not sure where the overhead is. I can't believe it is thread scheduling, as there are only two threads.
- During 100% IO, I intermittently noticed IO serialization with higher queue waits, which indicates some degree of serialization overhead in the OS, but I have not been able to track down which layer triggers it (see the inspection sketch after this list). I tried the cfq and deadline schedulers; the results were the same.
- The next attempt will be to replace the 3Gb/s SAS connection with a fibre channel HBA or 6Gb/s SAS (PERC H800) to see how it performs, along with a combination of hardware and software RAID instead of depending only on the controller.
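To illustrate how the queue waits can be inspected, a rough sketch using sysstat's iostat and the sysfs scheduler switch (device sdc as in the hdparm test above):

```
# Extended device stats each second during the benchmark; avgqu-sz
# (queue depth) and await (per-request wait, ms) are the columns
# that expose serialization symptoms
iostat -x sdc 1

# Swap schedulers at runtime to compare noop/cfq/deadline
echo deadline > /sys/block/sdc/queue/scheduler
```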
I expect more throughput on the random read test. Were you using ext3 or XFS?
It's ext3.
Overall, reads are not efficient, which is a surprise. When I enabled Adaptive Read-Ahead on the controller it yielded better results; but in that case we would have to disable read-ahead in InnoDB. Controller read-ahead can satisfy the fileio test, but it might hurt once InnoDB is in use. A my.cnf sketch for taming InnoDB read-ahead follows.
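As a rough illustration, this is how InnoDB's own read-ahead could be made as conservative as possible; note these variables are an assumption in that they exist only in the InnoDB Plugin / MySQL 5.5 and later, not in the older built-in InnoDB:

```
# my.cnf sketch: rely on the controller's Adaptive Read-Ahead and
# keep InnoDB's own read-ahead as quiet as possible
[mysqld]
# Require a full extent (64 pages) of sequential access before
# linear read-ahead fires (default is 56)
innodb_read_ahead_threshold = 64
# Leave random read-ahead off (MySQL 5.5+; OFF by default)
innodb_random_read_ahead = OFF
```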
The per-inode mutexes in ext3 limit the number of pending IO requests per file; xfs does much better in this regard. You have many files, though, so that should reduce the overhead, assuming sysbench's fileio request distribution is uniform across files. Regardless, I would like to see results for xfs.
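For anyone repeating the comparison, a rough sketch of an xfs rerun under the same sysbench workload; the mount point and mount options here are assumptions, and /dev/sdc1 is the device from the hdparm test above:

```
# Rebuild the test volume with xfs and rerun the same workload
mkfs.xfs -f /dev/sdc1
mount -o noatime,nobarrier /dev/sdc1 /mnt/bench
cd /mnt/bench
sysbench --test=fileio --file-num=64 --file-total-size=50G prepare
sysbench --test=fileio --file-num=64 --file-total-size=50G \
         --file-test-mode=rndrw --file-extra-flags=direct \
         --num-threads=16 --max-time=300 --max-requests=0 run
```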