This article is the third in a series of four where I share my findings on baselining, monitoring, stress testing and performance tuning. It builds on the foundations of my previous articles on baselining and monitoring.
Having a monitoring system in place is an important step in making you feel comfortable with what goes on in your production system. But what about that new release that is in the pipeline to be deployed in 2 weeks? Your monitoring system will only tell you what went wrong after it is too late. Of course the end user has tested this new release already, and he or she is convinced that it is exactly what they need, and that they need it urgently. But what if this “miracle” release —which developers are probably still developing — has some hidden features like table scans, (dead)locking issues and more of these fun side effects? Can you predict how this new release will behave in a production environment after being deployed? If you can, please teach me, because I can’t.
I believe the only way you can minimize performance issues (because you’ll never succeed in avoiding them completely) is to test the new release in a “production alike” environment. That’s why we need to organize stress tests.
The question is how to generate a comparable amount of load on your system. Of course, you can mobilize everyone in your company to start simulating a user for a certain period of time. But what if your manager is not keen on this idea (and he probably isn’t)? What if you need to perform identical test scenarios because your first results were not perfect (what are the chances they are?). In other words, you need an automated tool to simulate a variable number of users.
In the ideal world, a stress test scenario should test your entire application architecture covering all business processes. Not only is your database subject to this test but also your network, all external systems and of course your application servers. This implies that your automated tests process should simulate a variable number of users on production-identical hardware, conforming to business specifications.
But, this “full option” stress test environment has a price label attached to it. Not only the investment in the identical hardware needed but also especially the investment in time and knowledge needed to setup these test procedures can become quite costly. Your business will want to see hard figures before they will agree to invest the money. How to define a business plan to justify these costs is far beyond the scope of this article but the following points of interest should certainly be in it.
- How business critical is your application? How much money will your company loose if your application has limited or no availability?
- What are the costs of additional hardware? Can this hardware be shared with other applications?
- What are the costs of a third party tool that can simulate virtual users?
- How many man-days will have to be spent on setup and maintenance of the environment?
If you cannot convince your business of the need of a full option stress test scenario, there is still plan B:
- Only the database will be stress tested.
- No third party tool required.
This is not ideal but focusing on the database for a stress test scenario is not such a bad idea. Your database plays a very important role when it comes to scalability requirements. Where a number of scale out solutions exist for your application farm (add an extra node to the farm and the load gets balanced automatically), scale out is seldom an option for your database, so you need to get it right from the start.
This is what I like to call plan B and this is what this article will cover. I’ll provide a self-made solution that, with its limitations, can give you a good indication of how your database will react, before it goes into production.
Also in this article, I won’t describe fancy third party tools but I’ll concentrate on Microsoft out-of-the-box tools like SQLIO coupled with a few lines of C#.
This document is applicable to SQL 2000 as well as SQL 2005 unless otherwise specified.
Step 1: Define Your Business Specifications
Before I start covering all the technical processes, I want to stress the importance of well-defined business specifications. Your business, although often the source of all evil, remains your customer who knows best what the requirements are of the product you are building.
You need to get reliable answers to questions like:
- What are your most critical business processes and how often are they processed (e.g. number of sales activities per day, number of client requests per day, etc.)
- What is regarded (required) as acceptable response times from the online application? How about reports?
- How much data must be kept available on the live database (over 1 month, 1 year, 5 years?). Can data be archived? How and when?
- How many concurrent application users are expected?
- Are there periods of more than average usage? (E.g. Daily between 9:00 A.M. and 10:00 A.M. /yearly ’round about Christmas.) What business processes will be impacted the most?
- How much data latency is acceptable for reports?
For a number of these questions, the service level agreement (SLA) you defined with your customer can be consulted. If there is still no SLA, this is the time to start making one.