SQL University – Capacity Planning Week
I am honored to be a guest lecturer this week for SQL University. There have been some amazing professors helping guide you through SQL Server this semester, and I hope that this week meets your expectations.
The topic for this week is capacity planning. While not a critical task for DBAs on a regular basis, the failure to properly plan for the resources needed can easily become a critical problem at the worst possible time: when your systems are busiest.
What is Capacity Planning?
Capacity planning involves making estimates of the resources that will be needed in some area or system and then ensuring that you can acquire those resources to meet demand. The resources can be time, money, hardware, people, or just about anything that is used in your environment.
The traditional method of capacity planning deals with the resource requirements that your systems need. For SQL Server this has primarily been CPU, memory, and disk resources for each instance. While this seems to be a fairly benign and perhaps boring topic, it has enough moving parts that it quickly can overwhelm a DBA tasked with planning the resources needed and budgeting for them appropriately. There are other potential issues, and some of those will be discussed on Friday.
As our systems become more complex, include more interconnections between them, and become more important, the ability to properly plan for, and meet, the demand for your services is an important part of the data professional’s job. Even if you have server administrators, SAN administrators and other professionals responsible for physical hardware, if your database is not available or does not scale, you will be blamed.
People new to managing database servers constantly ask the question “how big a server should I buy?” as if there is a standard size server for a particular number of users or databases. There isn’t one, and these factors don’t really apply to proper server sizing:
- number of users
- number of databases
- number of transactions
- number of records in your largest table
While these numbers do seem to increase as the load on your server increases, resulting in the need for more resources, these aren’t the way that you can size a particular server.
Ultimately for new systems you can only take a guess at what level of CPU, memory, and I/O you will need. Likely it will be a bad guess, and so most DBAs and server administrators try to buy the largest server hardware just in case database use is far beyond what they expected. Or they economize and wind up with a severely underpowered server and a large percentage of unhappy clients.
For existing systems, you should have some idea of the load being placed on your systems. From this load, you can extrapolate to make predictions about future loads and plan accordingly to meet the demand.
The only really good way to properly size a system is to extrapolate future needs based on the past usage of the system. For existing databases, it helps to have a process set up that determines a baseline of not only the performance of your server, but also the capacity peaks and averages over time. These will help you extrapolate for future demand on the database.
For a brand new database, you can only take a guess. As good as your guess might be, you will still likely be wrong, but having a baseline process set up at the beginning can help you quickly decide by how much your estimates are wrong and adjust the hardware accordingly.
There are a number of articles written, and any number of product to help with the setting up of a baseline for your servers. The important thing to remember is that this is unique to each server, each company, and each database. The way that Microsoft Dynamics runs on your server instance will be different from mine, even if we have the same number of users. Don’t examine the scores from hardware manufacturers or software vendors as a way of determining what capacities you need to plan for. Those are useful for comparing systems in a theoretical way, but you need something more concrete for your instances.
You need an affordable baseline system that you can deploy to all your server instances. I have had good luck with home grown systems at most of the places I have worked. I have also used commercial off the shelf packages (COTS) with as BMC’s Patrol and HP’s Openview to record data, but often had to extract that data to run my own extrapolation calculations. In SQL Server 2008, Microsoft introduced the Management Data Warehouse, which does a lot of this monitoring for you as well, if you have Enterprise Edition. Technet has a performance baseline white paper that gives some good basic guidance on what to monitor in order to measure the performance of your system.
The specifics of how you gather and store this information are not that important. Use whatever method is convenient for you. The important thing is that you keep this data over time. You do not need extremely detailed information over time, but rather broad averages that allow you to determine if your resource requirements are increasing.
I used to track the performance at 5 minute intervals for all server instances across a month, and average that down to hour increments for the past year. This easily allowed me to extrapolate for future planning by loading the data into Excel and performing a linear regression analysis. This will produce a line graph that best fits your data points.
Third Party Tools
As mentioned, there are a number of products that can help you with baselining and measuring the performance of your systems. Any tool that measures the various Performance Monitoring counters can be used if you can store those measurements over time. What’s Up Gold, Unicenter, even SQL Response (from my employer, Red Gate Software) can help you determine a baseline for your system.
While it can be a relatively simple task to build a monitoring system on SQL Server, there are a lot of factors to consider, and building a new piece of software might not be the best use of your time. There are often other servers in your environment that also need capacity planning. File server, mail (Exchange) servers, and more. Ask your system administrators if they have a system in place, or software available that can measure the load of your servers over time. It might make sense to use a system already available or purchase one rather than building your own.
That’s all we have for this lecture. Please feel free to chime in on the topic capacity planning and specifically any systems or software that you have preferred for tracking a baseline over time.
In the next installment of SQL University, we will examine disk capacity specifically as this is an area that most often administrators struggle with managing.
If you have other links, your own blog, or comments to add to this topic, please feel free to leave a note/link below.