As mentioned in the first I/O and YOU post there is more to SQLIO (disk subsystem benchmark tool by Microsoft) then what was covered. This post will go into a bit more detail with the tool and explain some terminology used when troubleshooting I/O performance.
SQLIO and how to get the most out of it:
If you go back to the first blog post, follow the instructions and run SQLIO, you’ll likely get a good amount more detail then what was covered, so here is what it means to you:
After you allowed SQLIO, we now have a text document that looks really fancy. The first block of text you’ll see will look something similar to this:
This will give you the information you’ll need to know about the test you’re running.
The first two lines simply state that it is reading the parameters stated in the param.txt file you edited in part one.
Lines 3 and 4 create the file that will be used to simulate data being read and write to your hard drives the number of threads (CPU cores that the test will use).
Five and 6 tell you how long the test will run and exactly how the data will be read or written. This is a pretty important piece of information. The test includes several different types of block sizes, ranging from 8kb all the way up to 256kb, randomly or sequentially read or written to the drives. The graph below will help explain a couple of the block sizes. Your system will use different block sizes depending on what it is being tasked to do at that time. I won’t go into detail about each, but smaller block sizes are usually in OLTP databases – small database commits. The larger block sizes are bulk backups and the like.
Block Size | Random / Sequential | Write or Read | Explanation |
8kb | Random | Write | You and 19 of your friends go on a trip that requires you to take a plane, but your tickets have you all sitting in different spots of the plane. |
8kb | Sequential | Write | You’re going on the same trip with the same friends, but now everyone is sitting together from seat A10 to B29. |
256kb | Random | Write | The following year, you and your 19 friends have all gained a significant amount of weight, but want to go on the same trip. Now you each have to purchase two tickets to be seated comfortably. However you are all once again spread throughout the plane. |
256kb | Sequential | Write | You get the idea… |
8kb | Random | Read | Once you land you and all your friends need to get off the plane and onto a shuttle to take you to the terminal. The shuttle can’t leave until you are all on the shuttle. This will take longer since all your buddies are spread throughout the plane. |
8kb | Sequential | Read | Everyone is sitting together and the steward eagerly rushes everyone off the plane and onto the waiting shuttle. |
256kb | Random | Read | Due to you and your friend’s size, it will take a bit longer to get off the plane. |
256kb | Sequential | Read | Everyone is sitting together, but again you’re still taking a few extra milliseconds to grab your belongings and get off the plane. |
Now that I undoubtedly confused you even more. I think it’s time to move on to the last few lines to explain the total size of the file that is going to be created and where.
The next bit of information is the actual results:
We’ve already gone over the latency section in the first I/O and YOU blog. So I will skip over that. The other pieces of information are just as important, if not more important.
I/O’s/sec is the number of input and output operations that occur per second (IOPs). IOPs are generally determined by block size and the amount of data that has been touched, looked at, or moved in a given amount of time. You’ll often hear that a SQL server will need 20,000 total IOPs to run an application properly or Exchange requires 8,000 IOPs to run adequately.
TIP: If you’re looking to purchase an ERP system or a large application for your company. Ask the vendor how many IOPs they recommend. They should be able to give you a ball park. Then you can use this tool to help determine if you’re infrastructure is capable of properly running the software.
MBs/sec is more self-explanatory. This simply means how much data is pushed through the system in a given time. You’ll often hear this referred to as throughput. Depending on the test being ran this may vary wildly. This is due to data not being stored sequentially; the system will need to hunt down the correct blocks of data. Or, if you’re trying to store large blocks of data sequentially, the system will need to find a room large enough to store everything together. This all creates latency and slows the system down.
The final piece of information is the histogram. You’ll also want to see the numbers toward the left. If you have a high percentage to the right, that generally means you got some latency somewhere in the system.
This is just the tip of the iceberg when it comes to troubleshooting and understanding I/O in your infrastructure. There are hundreds of other tools out there that can help you determine if your system is having issues or running smoothly. Some are free like SQLIO and others you’ll need to pay for, but offer a great way to deep dive into your infrastructure’s backend. If you’re environment or application isn’t running like you think it should, try some of these steps or contact us.