T-SQL Tuesday #80–Chris Yates’ Birthday

tsqltuesdayIt’s the 80th version of T-SQL Tuesday. You can see all the parties from the past on my page, but feel free to jump in here or elsewhere. This is a monthly event, where anyone can write on a topic.

This month’s invitation is from Chris Yates, with today being his birthday.

Happy Birthday, Chris!

The topic is open this month, so here we go.

SQL Server Isn’t Easy Enough

I think that SQL Server is a fairly easy system to work with. I first started working with SQL Server on OS/2 1.3, and had come from working with lots of Unix, DOS, and Netware command lines. Windows 3.1 was still new, and we thought all the various command line tools for SQL Server were great.

Even as I continued on in my career, working with Oracle on multiple platforms, DB2, and other relational systems, SQL Server was the easiest to use.

However, I still find some simple, easy things that aren’t easy to handle. I think there is work being done with some of these items in Redmond (nothing specific I know or can say, but just a feeling), but these items have been an issue for some time.

Database and Log Backups

One of the really simple things many SQL Server professionals know is that we need to schedule at least one full backup to run, and then regular log backups, at least by default. The alternative is to set the database to simple recovery (not the default).

The problem is that for many installations, people don’t realize that they need to do this. While it’s easy to say they should learn this, it’s also silly. This could be easily handled with software setup. Create a default backup setup for each new database, as part of the database creation wizard.

I’d really like to see these defined in the database themselves, and loaded into the instance from the database, and hopefully we’ll get there.


I don’t have specific complaints here, though I could come up with some. The real issue I see is that replication is very brittle. The way replication works, and it’s barely changed in a long time, is that it requires lots of setup, and isn’t tolerant of issues. Far, far too often the solution when deploying changes to the database, or troubleshooting issues is to drop replication and re-enable it.

Most people running replication have been burned and end up scripting everything as soon as it’s set up.

This shouldn’t be that hard, or at the very least, some work should be done to improve the robustness of replication.


There could be other things, but I’ll stop here, and see what else people say about SQL Server this month.

SQL Server 2016 is Just Faster

I’ve enjoyed the series of posts from Microsoft on the ways in which SQL Server 2016 was improved and is faster. Some of these posts show improvement without any change in your code. Some showcase features that might require change, but will bring about improvements, and after all, wouldn’t you rewrite some code if you knew you could get performance improvements? Most of us would, if the changes are fairly simple. I know some changes are hard, but that’s our job as developers: make systems run better, even if it’s work to change our code.

Some posts are just fascinating. The one on log stamping, which highlights something I had no idea about. I wouldn’t think that stamping 0xC0 is better than 0x00, but I’m glad someone dives deep into hardware changes like this. I know many of us don’t have more than 8 CPUs running our databases, but we might at some point. Having a better way of ensuring soft NUMA works well automatically is good. Even if you don’t care about how this works from a practical standpoint, it’s an interesting look at how Microsoft is keeping up with hardware advances where a single CPU these days can encapsulate more than 8 cores.

SQL Server 2016 is not only the best version to date, but it’s also the most tested and evaluated. Apart from all the evaluations and writeups performed on the CTP 2.x’s , CTP 3.x’s, the RC’s, much of the code has also been running in Azure, where metrics have been gathered and evaluated for quite some time. There’s at least one feature I’ve been hearing about for over 3 years, since well before SQL Server 2014 was released, at which time it was running in Azure already. In all that time, there have been lots of tuning and tweaking to ensure that the code is running as efficiently and effectively as possible for this release.

Note that I’m not saying that the product is without bugs. While some people think there’s no need to wait for SP1 (I agree), I also think you should test thoroughly for your situation. It is entirely possible that there will be some bug that affects your code. And you need to test with actual systems and code. Also, be aware that there are documentation updates being made constantly, as we find that bugs also exist in BOL.

While SQL Server 2016 is faster than previous releases, there are quite a few new features that do require code changes. However, if you can take advantage of those features, I think it’s well worth the development effort and the upgrade cost. Let me know what you think.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( MB) podcast or subscribe to the feed at iTunes and LibSyn.


Stretch Pricing

I have to admit that I was really excited about the Stretch Database feature in SQL Server 2016. This will automatically archive older data away from your database, but let the query optimizer go get the data if needed. That’s outstanding. That’s the type of archive solution I’ve struggled to implement in the past, finding the effort complex and requiring application change or active DBA management. Often I’ve found that only 10-20% of the data in my database was accessed often, and the rest relatively rarely. Maybe old data was queried every day, but still somewhat rarely compared to a small percentage of the data.

Then the pricing for Stretch was released, and I think it’s definitely aimed at the Enterprise. If you stretch a sales database, say a 100GB database and want to move 60GB of that data away, you’re going to pay at least US$930/month for the compute at the lowest performance level. Regardless of whether anyone queries the data. If you want better performance, you can run up in roughly multiples of that amount ($1860, $2790, $3720, etc), however, that’s just for compute. If you add in storage, and you must, it’s a minor cost even 1TB, but still, having a $1k bill for access to archive data, especially when you might find people make the mistakes and do query cold(er) data might seem like a lot for a small or medium sized business (SMB). If you have to get better performance, you’ll pay more per month.

I’m not the only one that doesn’t love the pricing of stretch. It seems to me that the pricing very much favors the Microsoft share price more than the value of my own business. Perhaps this makes more sense at an enterprise level where storage costs can be high, and separating out older data could result in savings. However for smaller companies, if you’re running a SQL Server, even a 1TB (or 10TB) database, is the addition of another 1TB of storage going to cost much? I’m not sure it does.

Of course, there are other factors. Less data should mean much better performance from your local system. With some tuning of the feature, I would bet that plenty of people might be able to get 90% of their queries satisfied by on premise resources, when they have a substantial amount of older data stored in Azure. That’s not something I can easily do in my own archival system, or with the addition or more storage.

What I’d really like to see is a stretch to another SQL Server feature added. I’m guessing we’ll see that, likely in SQL Server 20,18 as I don’t think this would be hard to implement. However since this looks like a cash cow for MS, I bet when we get stretch to an on premise SQL Server, this will be an Enterprise only feature, once again, ignoring SMB needs and desires. Maybe in 2020 or 2022 we’ll be able to stretch on premise at a reasonable cost.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( MB) podcast or subscribe to the feed at iTunes and LibSyn.

It’s 2016 RLS for T-SQL Tuesday #79

tsqltuesdayIt’s T-SQL Tuesday time again. I missed last month, being busy with travel, though I should go ahead and write that post. Maybe that will be next week’s task.

In this case, Michael J Swart is hosting this month’s blog party and he asks us to write about something to do with SQL Server 2016. Read the rules at his invitation.

Row Level Security

I’ve wanted this feature to be easy for a long time. In fact, I’ve implemented a similar system a few times in different applications, but it’s been a cumbersome feature to meet, plus each developer needs to understand how the system works for it to work well. Even in the case where we once used views to hide our RLS, it was a performance issue.

Microsoft has made things easier with their Row Level Security feature. This was actually released in Azure in 2015, but it’s now available in SQL Server 2016 for every on premise installation as well.

Essentially for each row, there is some data value that is checked to determine if a user has access. This doesn’t mean a join. This doesn’t mean you write a lot of code. The implementation is simple, and straightforward, and I like it.

Security Predicate Functions

The one piece of code you need is an inline table valued function (iTVF) that returns a 1 for the rows that a user should see. You need to have some way to match up a row with a user, and that can be tricky, but if you identify a row, even in another table, you can use it.

For example, I have this table.

    OrderID INT IDENTITY(1, 1)
                PRIMARY KEY
  , Orderdate DATETIME2(3)
  , CustomerID INT
  , OrderTotal NUMERIC(12, 4)
  , OrderComplete TINYINT
  , SalesPersonID INT

There’s nothing in this table that really helps me identify a user that is logged into the database. However, I do have a mapping in my SalesPeople table.

    SalesPersonID INT IDENTITY(1, 1)
                      PRIMARY KEY
  , SalesFirstName VARCHAR(200)
  , SalesLastName VARCHAR(200)
  , username VARCHAR(100)
  , IsManager BIT

Granted, this could mean some change of code, but perhaps you can somehow use a user name in tables to query AD or other directory and map this to a user name.

Once I have that mapping, I’m going to create a function. My function will actually look at the SalesPeople table, and map the parameter passed into the function to the value in the table.

CREATE FUNCTION dbo.RLS_SalesPerson_OrderCheck ( @salespersonid INT )
            1 AS [RLS_SalesPerson_OrderCheck_Result]
            dbo.SalesPeople sp
              @salespersonid = sp.SalesPersonID
              OR sp.IsManager = 1
            AND USER_NAME() = sp.username;

In the function, I look at the USER_NAME() function and compare that to a value in the table. This is in addition to checking the SalespersonID column.

I can use a Security Policy to bind this function to my OrderHeader table as shown here:

CREATE SECURITY POLICY dbo.RLS_SalesPeople_Orders_Policy
  ADD FILTER PREDICATE dbo.RLS_SalesPerson_OrderCheck(salespersonid)
  ON dbo.OrderHeader;

This sets the function, passing in a column from the OrderHeader table, which is the column I want evaluated in the function.When I now query the OrderHeader table, I get this:

2016-06-13 11_42_16-Photos

There is data in the table. However, I don’t get rights by default, even as dbo. My USER_NAME() doesn’t match anything in the table, therefore no SalesPersonID matches. However, for other users, it works.

2016-06-13 11_42_32-Photos

There is a lot more to the RLS feature, but I think it’s pretty cool and it’s something that will be highly used in many applications moving forward, especially those multi-tenant systems.

Go ahead, get the free Developer Edition and play around with RLS.

A New Sample Database

When I started working with SQL Server we had the pubs database. This was in SQL Server 2014, and I only used it because code samples from the Internet used it. It wasn’t a great database, but it was consistent and known. When Northwind came along, it was a welcome improvement. The schema was larger and a better fit for showcasing various features. However those two samples were superseded by AdventureWorks.

AdventureWorks (ADW) was a fictional bicycle company, and we got a much larger, more complex schema. Perhaps overly complex, but many of us have learned to work with AdventureWorks for our sample code and demos through the years. This sample was released with SQL Server 2005 and then expanded and grown with 2008, 2012, and 2014, including newer features like Filestream and Memory-optimized Tables. However the complexity sometimes caused issues for people that didn’t want those features.

There are multiple versions of ADW from MS, a script to make it larger, or even larger, a workload generator, a version for Azure, and more. It seems as though lots of time and effort has been put into building demos and tests against the ADW database. However that might need to change. I saw a note that Microsoft has a new sample database, Wide World Importers, on Github. It’s for SQL Server 2016+ as it includes a number of items that are only available on that version. So far, it’s bare bones (31 tables, many of them System-Versioned), and MS is looking for comments for improvement. There isn’t a lot at Github yet, but I’m expecting that to change over time.

The idea of having sample database is good, but there’s effort to maintain them. We’ve had a few over the years at Redgate for demos, but we may move to using a Microsoft one so we don’t have to maintain it. Our preference is AdventureWorks, and maybe we’ll end up keeping it alive over time. I’m not sure I want a new database, but I also disliked the confusion of which AdventureWorks database versions would work with which SQL versions. Over time the database was named to include the version year, but it could still be confusing.

Perhaps we should get a new sample every version, but not completely new. Let’s have a core schema that’s always available, with some tables that are normalized, some not. Some with PKs and FKs, somewithout. We can add new objects for new features. Let’s ensure that older scripts work by leaving old objects alone, but new items covering all aspects of database development, are included. Let me know what you think today.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 3.8MB) podcast or subscribe to the feed at iTunes and LibSyn.

Running Multiple Threads Quickly

Recently I was looking to generate a quick workload against an instance. I wanted to add to the amount of work my system was doing by executing a number of queries simultaneously. There are a variety of ways to do this, but I just wanted a quick set of queries. I’ll show a couple ways to do this and then look at a few other tools in later posts.

The Cumbersome Way

I can open queries in two windows in SSMS. Note, each of these will execute 50 times.

2016-05-20 14_17_34-SQLQuery1.sql - (sa (56))_ - Microsoft SQL Server

Now I have two windows, and I can click execute in one, then click to the other, and click execute again. That’s easy. When I do this, I’ll have two threads, each running a query 50 times.

A Better Way

A better way is to use a SQLCMD call with my query in it. In this case, I’ll create a notepad file and add multiple SQLCMD calls in it.

2016-05-20 14_29_03-SalesDemo-2015-12-01-1745-export-i-fgod4b6h - VMware Workstation

The key here is the “start” at the beginning of the line. This will spawn a new thread with the program being called in it. In this case, I’ll get 5 windows very quickly, each running my query. My query is in another file:

2016-05-20 14_28_19-SalesDemo-2015-12-01-1745-export-i-fgod4b6h - VMware Workstation

If each query is set to run multiple times, I’ll have a simple load generated. In my case, I’ll run the .CMD file from the command line, but I could double click it. When I do, I see this:

2016-05-20 14_30_13-SalesDemo-2015-12-01-1745-export-i-fgod4b6h - VMware Workstation

You can see the window where I started the queries in front. Three of the command windows are in the background, each of them running queries over and over. The output from the query, with all the dashes for spacing between the headers and data, are in each window.

More Overloads

I was attending the SQL Nexus and SQL Bits conferences recently in Europe. These were launch events for SQL Server 2016, and the keynotes given by Microsoft included the overload of yet another SQL Server term. This time ACID was taken to be: Algorithms, Computer/Cloud, IoT, and Data. While I appreciate someone trying to be memorable or interesting in a keynote, to me, this creates confusion when we try to discuss the importance and positioning of SQL Server. ACID is a core term for relational databases, having nothing to do with the future. While some marketing people probably enjoyed this, I would have preferred they spent a little more time coming up with a term that describes a creative look at the future.

There have been plenty of “overloaded” terms from the SQL Server platform. DAC, is that a connection or part of a database schema package? Snapshots? We have a few of them. When we talk about logs, is this meaning transaction logs or logs for the SQL Server process? Or even Agent logs. Are clusters meant to be FCIs or traditional shared storage clusters? At least that last one is close to same meaning for both terms.

There are others, which creates confusion and could result in problems if two of think we’re discussing the same topic, but we’re not. I’d hope we’d realize there are ambiguity in a few minutes, but what if we’re dealing with a crisis? Clear meaning and understanding are important.

I don’t mean to be pedantic, but I don’t think it’s too much to ask that new features, new concepts be given new names. While marketing might think there’s a neat sound to an old acronym, for us technical people it’s annoying, and potentially a liability. If we will continue to have marketing people influence the ever growing platform, the least we can have is creative marketing individuals that can come up with new ways to describe the hard work completed by the SQL Server development team.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 2.7MB) podcast or subscribe to the feed at iTunes and LibSyn.

Half Baked Features

I gave a talk recently on some of the data protection features being added in SQL Server 2016, along with a few that have been available for some time. I think the talk went well, but I point out lots of potential issues with the features themselves. I had a question from someone that noted Always Encrypted, Dynamic Data Masking and more aren’t really full developed, are they?

The question threw me, not because I agree, but because I think that there is a complex answer. Encryption and data security features are easy to use. The features really are easy to implement, especially encryption. Most encryption is just another function call for the developer, which is something almost all of us can write. The mechanics of using these features are fairly easy.

However the architecture, the decisions on how to manage keys and where to deploy features, those are hard. We deal with those relatively infrequently, but when we must make those decisions, we should carefully consider the ways in which our systems might be attacked or mis-used.

Some of the restrictions that I see in various features are unavoidable. At least, I can’t see a way to avoid them. In some sense, things like a binary collation are almost required because of the nature of how encryption must operate on data. Perhaps there are ways to mitigate issues, but I’m not sure. No security mechanism is perfect and all encryption can be broken. However I think the way these features work is good enough in many situations.

There are some things, such as allowing Dynamic Data Masking on Always Encrypted columns, which can be implemented. However, the changes are more complex, and involve not just SQL Server changes, but probably also ADO.NET changes. Making the decision on how to actually implement these changes shouldn’t be taken lightly, and I’m happy to have a working feature that might be enhanced in later versions over not getting any options at all in this version.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 2.8MB) podcast or subscribe to the feed at iTunes and LibSyn.

Solving Tempdb Issues

While reading Remus Rusanu’s piece on SQL Server 2016 development, there was an item that struck me. There’s a part of the piece where he notes that an engineer at Microsoft realized that there was a tempdb issue with Azure SQL Database. To solve it, a failover was needed, which could be problematic. The basic issue was that tempdb files were sized unequally, discovered after “months of investigation.”

Now, on one hand this seems silly. After all, we’ve known since SQL Server 2005 that the guidance was for all files to be sized the same. Shouldn’t engineers at Microsoft be following all the practices  known for optimizing SQL Server performance? I think all Microsoft people should follow this, especially those working on other products (*cough* Sharepoint *cough*), but at the very least SQL Server engineers should have a huge list, perhaps with PBM or unit tests, that warn about non-compliance.

On the other hand, since we’ve known this is an issue since SQL Server 2005, why does SQL Server allow this? I’d think some simple tooling and a few checks in code could eliminate this as an issue in any new install. Catch the issue during upgrades and grow (or shrink) files to be the same size, or at least warn the administrator. I know there are situations where you need to add a new file in an emergency, but shouldn’t this be an item we push administrators to correct quickly after the emergency is past? Or at least force them to consciously make an exception.

There are plenty of limitations and restrictions in SQL Server systems that Microsoft forces on us. We have limits on where new features can be used, various historical limits on function parameters, and more. Why not also just enforce limits that prevent issues like this? I’m sure people will complain, but wouldn’t those complaints be outweighed by less issues overall from all customers?

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 2.3MB) podcast or subscribe to the feed at iTunes and LibSyn.

Cloud First

SQL Server 2016 is the first “cloud first” release of SQL Server, as told to us by Remus Rusanu, former Microsoft SQL Server development team member. The features and enhancements in SQL Server 2016 have been running in the Azure SQL Database cloud for some time before they will be packaged up and released as the on-premise SQL Server product that most of us expect to work with.

There are a number of items about this worth discussing, but one stands out to me. The idea of using large numbers of feature flags and deploying code to the “cloud” for use by customers and internal Microsoft people is interesting. On one hand, it’s really just a faster way of having beta software examined by users other than developers, with no install/uninstall/upgrade for users. Speed is good, but being on the bleeding edge and having systems break isn’t necessarily what any of us want. However the use of these flags to turn features off quickly means that disruptions can be minimized for individual customers.

Some of the engineering process changes made to be cloud first were important for Microsoft to have one main branch of code for SQL Server. Having a separate branch for the cloud and on-premise versions had to be inefficient and problematic. However, that’s not something we, as customers, care about. We just want the database platform to work, wherever we run may run it.

I do think that having features come out in Azure, whether private testing, public preview, or general availability is good. The people that can test these features give feedback quickly, and the rest of us aren’t affected by the problem code. More importantly, the developers at Microsoft get the chance to learn more about how the features will be used and tune them before a very wide set of on-premise customers get code. Personally I was glad to see things like Row Level Security (RLS) appear in Azure first (Mar 2015), and come later to the on-premise version (SQL Server 2016).

I really expect in the future that almost every feature that is added to the SQL Server platform will be run in some form in Azure before it’s released for on-premise use. In fact, I’d be surprised if features are added to the on-premise without 6-12 months of testing in the cloud. That’s good, and I hope it leads to an increase in code quality for those of us that will run SQL Server on our own hardware.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 3.0MB) podcast or subscribe to the feed at iTunes and LibSyn.