Stress Testing

Many of the DBAs who manage production systems will at some point determine what level of hardware is needed to support a workload. Whether this is a physical server purchase or a cloud “rental”, someone has to decide what hardware is needed: how many cores, the amount of RAM, the number of disks (which hopefully correspond to some level of IOPS), and more. Even in the Azure SQL Database world, you must decide what database capacity you will pay for.

Since this is a big decision, and changes can be hard to make, many DBAs overbuy hardware. After all, no one wants a slow server. This is true for Azure as well, at least for many people I know. While changing from an S3 to a P2 is quick and easy in the Azure portal, it’s not such an easy sell to management. If they’ve budgeted $150/month and you tell them you want to go to $900/month, the technical change is the easiest part of the process.

As a result, I’m surprised that we don’t really have better ways to determine if hardware will support our workload. I see this question asked all the time, and although there are tools and techniques suggested, I’ve yet to see a set, known, standard way of evaluating hardware against a particular workload.

On one hand, I think there should be better tools to do this, whether from Microsoft or someone else. I suspect that since this is such a rare activity, and businesses have been willing to overbuy hardware (or deal with substandard performance), there isn’t any large impetus to solve this issue.

However, I wanted to ask if any of you actually stress test hardware, either your current hardware or new purchases. If you don’t know what level your current hardware performs at, how do you compare that to new hardware?

Do you have a way to replay and measure a workload? Do you have the time to do so when new hardware arrives? Is there a documented method you use? Apart from discussing this today, I’d love to see some articles that detail exactly how you test hardware from a technical tool perspective, and then a follow-up that examines and evaluates the results.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.2MB) podcast or subscribe to the feed at iTunes and LibSyn.

What’s Your Downtime?

That’s the question this week: What’s your downtime?

I thought about this as I read this piece on Azure’s downtime being greater than its rivals’ in 2014. You do need to provide an email to read the article, but essentially it builds on tracking from CloudHarmony to show that Azure had the most downtime for the year, about 54 hours. In Europe, the numbers were much lower: 5.97 hours for compute and 1.31 hours for storage, so the European Azure cloud is doing better than the other regions.

That’s pretty good. Certainly individual machines went down, and services were unavailable for short periods of time during failovers, but holding a hardware platform to around 5 hours of downtime a year is good. I’m not sure that many of my companies have done this, though to be fair, mostly it’s been patches from Microsoft that caused downtime for Windows machines.

However, let’s look at your database instances. Do you know what your downtime is? I’d encourage you to track it, and be sure that you report on it, or at least have the data in case the boss asks. I don’t know what our SSC total downtime is, but I can tell you that our cluster rebooted in Aug 2015 and Mar 2016, brief outages for some patching. In the past, I’ve typically seen our database instances run the better part of a year, usually only coming down for a Service Pack.
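
If you want a quick place to start, the instance can at least tell you when it last started. This doesn’t give you a downtime history, but it’s one number you can check right now; a minimal sketch using a standard DMV:

SELECT sqlserver_start_time,
       DATEDIFF(HOUR, sqlserver_start_time, SYSDATETIME()) AS hours_up
FROM sys.dm_os_sys_info;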

If you have monitoring in place, and you should, then I hope you have some idea of downtime for your main databases. If you can share, see if you might set the high water mark for the SQLServerCentral community in the last year.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (2.4MB) podcast or subscribe to the feed at iTunes and LibSyn.

Restoring a Striped Backup–#SQLNewBlogger

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

Recently I uncompressed some SQL Backup Pro files. Since multiple threads were used to make the backup, I ended up with multiple files in my file system, as shown here:

[Screenshot: the uncompressed backup files in the Backup folder]

Each of these is one piece of a striped backup. To restore it, I need all the files to reassemble the complete backup. This is fairly simple, but you should be aware of how it works and how to perform the restore.

In my case, you can see I have 7 files for each database. They have the same name, with an _0x suffix, where x is the number of the file.

To restore these, I go to SSMS and select the restore option. That gives me the Restore Database dialog, where I can select that I’m restoring from devices. As you can see below, no devices (files) are selected yet.

[Screenshot: the Restore Database dialog]

I then click the ellipsis to get a dialog allowing me to add files.

[Screenshot: the Select backup devices dialog]

Hopefully you’ve done this before, and you can click “Add” to add files. You need to navigate to the location of your backup files if it isn’t the default.

[Screenshot: the Locate Backup File dialog]

Next you can select the files. Holding down CTRL, I can multi-select files.

[Screenshot: the Locate Backup File dialog with multiple files selected]

Once I pick them, I click OK and then I see them all in the device dialog.

[Screenshot: the Select backup devices dialog with all the files listed]

Now I click OK and SQL Server reads the headers, and I end up with a single database to be restored, as shown below.

[Screenshot: the Restore Database dialog for SQLServerCentral]

Now, I can click OK, and often do in development areas. HOWEVER, if you are doing this in production, please, please, please, click the Script button instead. You’ll get a new query window, and you can cancel out of this dialog to see the code.

[Screenshot: the scripted RESTORE statement in a new query window]

From here, you should click “Save”, save the script, and then execute it.

As you can see above, the statement is simple. List each disk file, separated by a comma. Then the rest of the RESTORE statement is a standard restore.
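
For reference, the pattern looks like this. This is a sketch with hypothetical paths, and only three of the stripes shown; you’d list all seven files:

RESTORE DATABASE SQLServerCentral
 FROM DISK = N'D:\Backups\SQLServerCentral_01.bak',
      DISK = N'D:\Backups\SQLServerCentral_02.bak',
      DISK = N'D:\Backups\SQLServerCentral_03.bak'
 WITH RECOVERY;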

SQLNewBlogger

This is a fairly simple task, one I’ve done dozens of times, but it’s worth practicing. If you want to write about it, consider: what happens if you’re missing a file? What if you change the order of the files? This was a 10-minute blog, and it’s a good chance to play and practice your restore skills, which are as important as anything you do as a DBA.

Creating a Database Snapshot

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

I’ve rarely dealt with database snapshots outside of lab experimentation. They didn’t exist when I did most of my DBA work, and since then we haven’t seen the need for them at SQLServerCentral, though I may suggest we add them to our deployment process, since they would let us quickly roll back if something breaks.

However, I created one recently for a test and realized that I’d forgotten the syntax. Unlike the quick "create database xx" syntax I often use, with a snapshot I need to be more specific.

The big item that I must specify is the location of the snapshot file. This is the file that gets written to by the copy-on-write process, which ensures the original state of the pages, as of when the snapshot was created, remains available.

You also need to give a logical name for the file, which must match the logical name of the source database’s data file. The snapshot database’s name, however, can be anything. I’d say that this is bad:

CREATE DATABASE sandbox_snapshot
ON (NAME = MySnap
, FILENAME = 'h:\SQL Server\MySnap.ss'
)
AS SNAPSHOT OF Sandbox

But this is better:

CREATE DATABASE sandbox_snapshot_20150122_1345
ON (NAME = MySnap
, FILENAME = 'h:\SQL Server\MySnap.ss'
)
AS SNAPSHOT OF Sandbox

Because the snapshot is based on an existing database, at a particular point in time, it’s useful to include the creation time in the name, if possible. While you can get this from metadata, if you have people who look at multiple snapshots for information, it can be handy to know at a glance when each one is from.

However, if you are providing this on a schedule, say daily, for people to report from, you might need to have the same name every day. Think about this up front, as you cannot rename a snapshot once it’s created.
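
If you go that route, a drop-and-recreate on a schedule is the usual pattern. Here’s a minimal sketch, assuming the Sandbox database from above and that MySnap is its logical data file name:

-- drop yesterday's snapshot if it exists, then recreate it with the same name
IF DB_ID('sandbox_snapshot') IS NOT NULL
    DROP DATABASE sandbox_snapshot;
GO
CREATE DATABASE sandbox_snapshot
ON (NAME = MySnap
, FILENAME = 'h:\SQL Server\MySnap.ss'
)
AS SNAPSHOT OF Sandbox;
GO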

SQLNewBlogger

When I realized I had to look up the syntax, I took a few notes and captured code, which meant I combined this writing (< 10 minutes) with other work I was doing.

You should do the same. When you tackle something new, take screenshots, save code, and drop it in a OneNote/Evernote/etc. notebook for your weekly blog writing.


Disabling SQL Server Network Protocols

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

I ran across a question on network protocols recently, which is something I rarely deal with. Often the default setup for SQL Server is fine, but there are certainly times you should add or remove network connectivity according to your environment.

Here’s a short post on turning off (or on) a network protocol for SQL Server.

What’s Enabled?

The easiest way to verify what’s enabled is to use the SQL Server Configuration Manager. You’ll need administrative permissions on the host to run this, but it’s easy to find.

[Screenshot: finding SQL Server Configuration Manager in the Start menu]

Once you open it, typically you’ll have a list of the items that can be configured.


We want the SQL Server Network Configuration, which is the server level configuration for this host. The Client configurations are for when the host is used as a client to connect to a SQL Server.


On this instance, I have Shared Memory and TCP/IP enabled, but Named Pipes disabled.
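
As a quick cross-check from T-SQL, you can see which protocol your current connection is using. This shows what’s in use, not what’s enabled, but it’s handy:

SELECT session_id, net_transport, auth_scheme
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;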

Disabling a Protocol

As you might expect, this is easy. I right-click a protocol, and I can change the status. In this case, I’ll disable Shared Memory.


Once I do that, the protocol is disabled in the configuration, but not on the running instance. I get a message telling me the change will not take effect until the service is restarted.


I need to restart the server. Once that’s done, no one will be able to use Shared Memory on the host.

I can re-enable the protocol the same way.


Of course, I need to restart my instance again.

Checking the Log

When SQL Server starts, quite a bit of configuration information is written into the log. This is useful for troubleshooting in many cases. One of the things you’ll find is the network listeners, as shown here.

[Screenshot: the Log File Viewer showing the network listener entries]

This is usually after the database startup information, so if you look, you can see I have some local pipes and some TCP/IP settings here.
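
If you’d rather search than scroll, the undocumented xp_readerrorlog procedure can filter the current log for the listener messages. This is a sketch, so treat the parameters as an assumption:

-- 0 = current log, 1 = the SQL Server error log (not the Agent log)
EXEC sys.xp_readerrorlog 0, 1, N'listening';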

SQLNewBlogger

After reading the question, this took less than 10 minutes to write, including making the screenshots. However, I’ve done this before. If this were your first time, it might take you longer to research and write, but I bet most DBAs could do this in 30-45 minutes.

Tracking Logins with Extended Events

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

I was building a question the other day on Extended Events and needed to create a new session. One of the easier sessions to build is one that captures logins, so I decided to use that, and then wrote this post on how to build the session.

The first thing is to go to the Management folder in SSMS and then the Extended Events folder. I right-click the Sessions folder and select New Session.


This gives you a dialog. Like many in SSMS, we start with a name. You could also select the option to start the session at server startup, which I’d do if this were a production system.


When I click Next, I get to the "Events" tab, which lists all the events. I’ll scroll down to login and select it. Then I need to click the arrow to the right.


Once I do that, my event is in the session.


After I pick the events, I choose the fields I’m going to capture. There is a "Configure" button in the upper right that you click. This scrolls the dialog over.


I can select any number of fields to capture. I picked client_hostname, and I would also add nt_username and username from the list. You could add more, but in this case, I’m more concerned with seeing who’s logged in.

I could add filters, but I choose not to. I click on Data Storage to determine where to store this data.


For auditing, I might want a file. In this case, for testing, I’ll use the ring buffer, which is in-memory storage.


That’s it for getting the session set up. However, it’s not started. To do that, I need to right-click the session and select Start.


This will allow the server to start collecting data. Is it working? Let’s see. We’ll watch the data: right-click the session again and select Watch Live Data.


This pops open a window, which I usually make into a separate vertical tab group. Once that’s open, I’ll click "New Query" in SSMS, which will log me in again.


A few events popped up in the live data window, so I am capturing data. Don’t forget to stop the session after this if you don’t need it.
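
If you’d rather skip the dialogs entirely, the whole session can be scripted. This is roughly what the GUI produces for the choices above; the session name is mine, so treat the details as a sketch:

CREATE EVENT SESSION TrackLogins ON SERVER
ADD EVENT sqlserver.login (
    ACTION (sqlserver.client_hostname, sqlserver.nt_username, sqlserver.username))
ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE = OFF);
GO
-- sessions are created stopped; start this one explicitly
ALTER EVENT SESSION TrackLogins ON SERVER STATE = START;
GO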

SQLNewBlogger

This post came out of work I was doing, the kind of thing I’d likely do as a DBA. However, as soon as I got things working and tested, I knew this was a good post. In fact, I got a couple of posts from the process. The setup and testing took about 20 minutes, including a little research. However, the writing for this was about 10 minutes.


The Writeable Warm Standby

I saw a question recently that went like this: I get one full backup from FTP. I’ll get daily log backups through FTP after this, but never another full. I need to restore this daily log backup and allow the group receiving the database to read/write the copy, and then reset it with the new log backup overnight.

First, this is untenable. At some point you’ll have an issue with the transfer, lose a log, or the database will become corrupt. I can guarantee that at some point you will need another full backup. Not every week, or even every month, but you will need one.

Second, this is a tough situation. I saw some answers, which I agreed with, but I started thinking about ways to get that data moved. My first thought was to use STANDBY and move the data every day to a clean database. I’ve done this before, and in the GB range, even 100s of GB, this can work. It helps if you can whack indexes and constraints on the destination, but a table-by-table copy of the data goes fast.
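
To make that concrete, here’s a rough sketch of the pattern, with hypothetical database, table, and file names: keep a standby copy current with each day’s log backup, then copy the data into a clean, writeable reporting database.

-- apply the daily log backup, keeping the standby copy readable
RESTORE LOG mydb_standby
 FROM DISK = 'D:\Backups\mydb_log.trn'
 WITH STANDBY = 'D:\Backups\undo.log';
GO
-- then move the data, table by table, into a writeable database
TRUNCATE TABLE reporting.dbo.mytable;
INSERT reporting.dbo.mytable
 SELECT id FROM mydb_standby.dbo.mytable;
GO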

However, I then thought about other ways. You can’t take a backup of a standby database, nor can you take a snapshot of one. While searching, though, I saw an answer to this post on SO.

TL;DR: copy the mdf/ldf to a new database.

That was interesting, so I decided to test it. Turns out, it works pretty well. HOWEVER, it’s dangerous, and I think you should be very careful about this. I wouldn’t count on this being production stable, and certainly not data stable. You better have other copies of this data.

Here’s what I did. First, create a database with some data.

CREATE DATABASE mydb;
GO
USE mydb;
GO
CREATE TABLE mytable(id INT);
GO
INSERT mytable SELECT 1;
GO

Next, let’s back this up, take it offline, and then copy the files.

USE master;
GO
BACKUP DATABASE mydb
 TO DISK = 'mydb.bak';
GO
-- take mydb offline while the files are copied (bring it back online afterwards)
ALTER DATABASE mydb SET OFFLINE;
GO

Now we can copy the files to new files. There are UAC issues here, so you’ll need to add some rights to the files if you do this regularly. I left a comment in my script, but I actually did a CTRL+C, CTRL+V in the data folder, allowing the UAC permissions to work.


Once that was done, I had new files in the data folder.


I tried to run an attach, but got a permissions error:

Msg 5120, Level 16, State 101, Line 30
Unable to open the physical file “D:\SQLServer\MSSQL11.MSSQLSERVER\MSSQL\DATA\mydb_reporting.mdf”. Operating system error 5: “5(Access is denied.)”.

The solution I found was to run SSMS as an administrator. Annoying, but it works.

At least for the OS error. However, then you get this:

Msg 1824, Level 16, State 1, Line 35
Cannot attach a database that was being restored.

So you can’t do this, at least not easily with an attach. However, there is a workaround. First, delete the files you copied over, then run this:

CREATE DATABASE mydb_reporting;
GO
ALTER DATABASE mydb_reporting SET OFFLINE;
GO

The next step is to delete the MDF and LDF files, which will be mydb_reporting.mdf and mydb_reporting_log.ldf by default. I could specify other names, and would if this were something I needed to script.

Once those files were deleted, I’d next copy the files from my standby copy of the database (mydb_reporting_base, restored from the full backup with STANDBY) and rename them. That would result in this:

  • mydb_reporting_base.mdf –> mydb_reporting.mdf
  • mydb_reporting_base_log.ldf –> mydb_reporting_log.ldf

Now I can go back to SSMS and run a simple ALTER.

ALTER DATABASE mydb_reporting SET ONLINE;

Then I can run a quick query to check the data.

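The original screenshot showed the query and its results; a reconstruction using the table from earlier would be:

SELECT id FROM mydb_reporting.dbo.mytable;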

I have a copy of my database. Can I apply logs and move forward? Let’s try. First, let’s add data and make a log backup.

USE mydb
GO
INSERT mytable SELECT 99
GO

BACKUP LOG mydb TO DISK = 'mydblog.trn'   
GO

Next, we restore the log to the standby database, and then set both databases offline again.

RESTORE LOG mydb_reporting_base
 FROM DISK = 'D:\SQLServer\MSSQL11.MSSQLSERVER\MSSQL\Backup\mydblog.trn'
 WITH  MOVE N'mydb' TO N'D:\SQLServer\MSSQL11.MSSQLSERVER\MSSQL\DATA\mydb_reporting_base.mdf'
    ,  MOVE N'mydb_log' TO N'D:\SQLServer\MSSQL11.MSSQLSERVER\MSSQL\DATA\mydb_reporting_base_log.ldf'
    , STANDBY = 'D:\SQLServer\MSSQL11.MSSQLSERVER\MSSQL\Backup\undo.log';
GO
USE master
GO
ALTER DATABASE mydb_reporting_base
 SET OFFLINE;
GO
ALTER DATABASE mydb_reporting SET OFFLINE
GO

Once again, it’s file copy time. UAC comes into play again as I copy and rename the _base files. However, once that’s done, things work.


This works, but I am not recommending this as something you should do, especially for critical systems. This can work, but it’s dangerous, and really not supported by MS.

Virtual Failback

I saw a very interesting tweet from Scott Stauffer asking this: “… looking for quick’n’easy way to make workload virtual w/ fail-fast method to physical if performance doesn’t meet expectation.” This was part of a conversation Scott was having about moving to a virtual environment from his current physical one. That’s something I think more and more people are being asked to do, whether they do it inside their own data center or move to some type of hosted or cloud environment. There is increasing pressure from management to consider using cloud-type environments to reduce the capital expenditures for new systems and move to an operating cost model.

I don’t have a fundamental problem with cloud environments, though I think it is important to carefully consider the pros and cons, and I can certainly appreciate Scott’s concern. No matter how well we architect things or prepare for the move from a physical environment to a virtual one, there could be problems. Having a fallback plan becomes important, and even more so if we discover problems only after some time has passed.

While there are utilities that can move a physical machine to a virtual environment, there aren’t any (or any I know of) that reverse the process. Honestly, though, I think virtualization has so many advantages that if I really had performance issues and needed to return to a physical host, I’d continue to virtualize my instance, but I’d have only one VM on my physical host, with access to almost all the resources on the hardware. Today’s hypervisors have so little overhead, I wouldn’t hesitate to run one virtual machine on a host.

Ultimately, moving to a virtual environment is very much like moving to new hardware. There are definitely different configuration options you may need to set, but you can contract for some help with configuring your system. In the worst case, just use a single VM on a host, get hardware abstraction, and manage the machine like any other. Just don’t forget to have the hypervisor and your guest start up automatically after a reboot.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.0MB) podcast or subscribe to the feed at iTunes and LibSyn.

Learn to Use Filegroups

This editorial was originally published on May 12, 2011. It is being re-run as Steve is away at SQL Relay today.

In SQL Server, filegroups are a management technique that I don’t see many people using. It’s amazing how many people ask questions about filegroups on the discussion forums, often unsure of how they fit into a well-architected SQL Server. I have tended to use filegroups mostly as a space management technique, when I need to add more disks to my server, but they can be used in many more places.

We continue to grow our data sizes all the time. While many databases are still measured in the single digits of gigabytes (or smaller), it is fairly common to find many database servers with over a terabyte of disk space. Our disks grow larger and larger, but it seems that data grows faster than disks, requiring larger storage subsystems all the time.

While our storage grows larger, the tolerance for delays shrinks and demands for better performance increase. That means that data professionals need to be more cognizant of not only how their code is written, but also how they design storage. Tiering storage is one idea that I think has merit, but one that requires some planning.

In SQL Server, we can’t split a table across filegroups. Or can we? We can partition a table (Enterprise Edition and higher), which can allow us to manage performance and storage appropriately. There is also the recommended practice of keeping only system objects in the primary filegroup and using separate filegroups for user data. That allows you to bring a database partially online, again in Enterprise Edition only, while you restore other filegroups.
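
Here’s a minimal sketch of those mechanics, with a hypothetical database, path, and table:

ALTER DATABASE Sales ADD FILEGROUP UserData;
GO
ALTER DATABASE Sales
 ADD FILE (NAME = UserData1, FILENAME = 'D:\SQLData\Sales_UserData1.ndf')
 TO FILEGROUP UserData;
GO
USE Sales;
GO
-- new tables can be placed on the filegroup explicitly
CREATE TABLE dbo.Orders (OrderID INT NOT NULL) ON UserData;
GO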

This isn’t the first thing I would recommend you learn about SQL Server, but as you advance your knowledge, you should better understand when and how filegroups can help you. You will use them at some point and being comfortable with a filegroup restore is one of the skills that separates the accidental DBA from the data professional.

Steve Jones

The Work of the Ancients

I was reading a post from someone recently where they noted that they didn’t worry too much about the architecture of the system, since it wouldn’t likely last very long. The poster commented that many systems are replaced inside of a few years.

In my experience, that’s not usually the case. In fact, while I don’t expect many applications I’ve worked on to last for 25 years, I suspect many of them will exist for ten years or more, especially if they are receiving regular development resources. With that in mind, I wanted to see how your databases are faring these days. I suspect a database might last longer than a particular application, as it seems most organizations are loath to ever let data go.

What’s the age of your oldest, regularly used database?

I’m looking for an age in years. If the answer is less than one, I’m not sure I’d call that old at all. I am sure many of your systems are older, and might have changed, but let us know the year when the system went into production.
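
If you’re not sure how old your databases are, the instance can give you a rough answer. Keep in mind that create_date is reset by a rename (and tempdb’s resets on every restart), so treat it as an approximation:

SELECT name, create_date
FROM sys.databases
ORDER BY create_date;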

I can tell you the SQLServerCentral systems are old in some ways, not so old in others. We’ve grown from one database to three over the years. The oldest database is circa 2003. Some of the data from that one was migrated to other databases around 2007. We’ve got data in the system since 2001, but we’ve certainly changed structures and storage over the years.

I’d guess that most of you working in companies older than ten years will have a database that’s at least that old. However, let us know this week, and if you have any interesting notes, feel free to share them.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (2.4MB) podcast or subscribe to the feed at iTunes and LibSyn.