
Visualizing the Tally Table

I was reading Dwain Camps’ article on Time Slots and thought it was a very interesting solution to a problem I’ve had a few times: getting time slots inside of a period that I want to query. If you have a similar need, or want to learn more, I’d urge you to read the article.

It’s always easier to join to a set of data that matches what you need than to try and filter out other rows. SQL excels at joins, so whenever possible you want to join to data. As such, when I was looking at Dwain’s code, I thought the way he listed the tally table was very interesting. I’ve seen plenty of these generated, but I hadn’t run across someone spelling it out in comments. In case you are wondering, Dwain had code like this:

WITH Tally (n) AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 -- zero-based
    -- Returns exactly 86400 rows (number of seconds in a day)
    FROM (VALUES(0),(0),(0),(0),(0),(0)) a(n) -- 6 rows
    CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n) -- x12 rows
    CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n) -- x12 rows
    CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n) -- x10 rows
    CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e(n) -- x10 rows
), -- = 86,400 rows

That’s a great visualization, and one I plan on using in the future. It makes it easy to see what’s being generated and at what scale.

For example, I can easily do this:

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES (0), (0)) a(n) -- 2 rows

which returns a single-column table with the values 1 and 2 in it. Two rows.

Let’s say I want to now build a list of 12 rows. I could do this a few ways. One is to multiply 2 x 6 and get 12.

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES (0), (0)) a(n) -- 2 rows
CROSS JOIN (VALUES (0), (0), (0), (0), (0), (0)) b(n) -- x 6
;

Or I could give myself more flexibility to add and remove data with comments by breaking the count into more factors. How about 2 x 3 x 2 = 12?

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES (0), (0)) a(n) -- 2 rows
CROSS JOIN (VALUES (0), (0), (0)) b(n) -- x 3
CROSS JOIN (VALUES (0), (0)) c(n) -- x 2
;

That gives me the same result: 12 rows. Of course, I can easily expand this quickly to thousands of rows.
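To go bigger, here’s a sketch along the same lines that stacks three ten-row lists, giving 10 x 10 x 10 = 1,000 rows (one more CROSS JOIN makes it 10,000):

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) a(n) -- 10 rows
CROSS JOIN (VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) b(n) -- x 10
CROSS JOIN (VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) c(n) -- x 10
; -- = 1,000 rows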

The technique isn’t anything new, but the visualization is interesting, and to me, this is a much easier technique to see and understand when you run into it in code. Right away I know how many rows I’m generating, and I can easily see how to grow or shrink that number if I have the need.

Who Am I?

I saw Brent Ozar updating his About Me slide, and thought it was interesting. Not sure if I like what he’s done, but it’s different, which usually attracts attention.

I updated mine last year, with some help from the people at Red Gate. Here’s my slide.

[Image: my About Me slide, from the CI for Databases PowerPoint deck]

No description of jobs, what I’ve done, my age, company, etc. No MVP or other awards or certifications.

I wasn’t sure about this early on, but I like it. I can tailor the talk about me to the situation, making it short or long. I also include the information on the right at the end of the deck as the last slide, so people can grab it there.

I’m not sure if this works, but I have seen a bit more traffic to my blog and more Twitter/LinkedIn activity.

I also get some fun comments when I’m actually wearing this shirt at a talk.

Changing NULL to NOT NULL – Dealing with Data

I wrote recently on how to alter a column in SQL Server from NULL to NOT NULL in a simple way. However, I didn’t cover some of the cases where you have data in the column, or other restrictions that you might need to deal with. In this post, I want to look at a few options if you have data in the table.
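For reference, SimpleTable looks something like this (a minimal sketch; the actual definition is in the earlier post, and the first two column names here are hypothetical):

CREATE TABLE dbo.SimpleTable
(
    MyID INT -- hypothetical column
  , MyChar CHAR(1) -- hypothetical column
  , Status TINYINT NULL -- the column we want to change to NOT NULL
);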

Let’s take my SimpleTable and add data:

INSERT dbo.SimpleTable
 VALUES (1, 'A', 1)
 , (2, 'B', 0)
 , (3, 'C', NULL)
 , (4, 'D', NULL);

Now I have 4 rows. If I run this:

ALTER TABLE dbo.SimpleTable ALTER COLUMN Status TINYINT NOT NULL;

I get this error:

Msg 515, Level 16, State 2, Line 2

Cannot insert the value NULL into column 'Status', table 'Sandbox.dbo.SimpleTable'; column does not allow nulls. UPDATE fails.

The statement has been terminated.

Essentially, SQL Server can’t mark this column as NOT NULL because there are NULLs in there. What can I do here?

UPDATE the Table

The only thing I can do is change the data. I can run an UPDATE statement that removes all the NULLs in the column by setting them to some chosen value. Something like this:

UPDATE dbo.SimpleTable
 SET Status = 0
 WHERE Status IS NULL;

Once I do that, I can easily run my ALTER and it succeeds.
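Putting the two steps together, here’s a minimal sketch that wraps the data change and the schema change in one transaction, so that if either step fails, both roll back:

BEGIN TRANSACTION;

UPDATE dbo.SimpleTable
 SET Status = 0
 WHERE Status IS NULL;

ALTER TABLE dbo.SimpleTable ALTER COLUMN Status TINYINT NOT NULL;

COMMIT TRANSACTION;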

 

Unit Testing in Philadelphia

I’m helping teach a pre-conference session on Friday, June 5, 2015 at the Microsoft office in Philadelphia. This is an all-day, paid event that looks at how you can use a framework to write unit tests for your T-SQL code. We’re the day before SQL Saturday #390 in Philadelphia.

I really believe in testing, and am trying to advocate for it in many places. I’ve delivered a testing session that has been well received at quite a few events, and this is the first time I’m trying a full-day training class.

I am working with Sebastian Meine, the founder of tSQLt, to present the class. We’ve got a busy outline, looking at a variety of ways that you can write tests and use them to find problems in code. Here’s what we’re covering.

  • Introduction to Unit Testing
  • What is tSQLt?
  • Your First Test
  • Executing Tests Correctly
  • Effective Use of Assertions
  • Separation of Concerns
  • Testing Exceptions
  • Test Case Heuristics
  • Dealing with Test Data
  • Other Types of Testing
  • How Unit Testing Fits into Your Development Process

At the end of the class, you should have some good ideas on how to build and structure tests in your own environment and be ready to start testing on Monday.
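To give a flavor of what we’ll cover, here’s a minimal sketch of a tSQLt test; the test class name and the function under test are hypothetical:

-- Create a test class (a schema that groups related tests)
EXEC tSQLt.NewTestClass 'testSales';
GO
-- A test is a stored procedure in the class whose name starts with 'test'
CREATE PROCEDURE testSales.[test AddTwoNumbers returns the sum of its inputs]
AS
BEGIN
    -- dbo.AddTwoNumbers is a hypothetical function under test
    DECLARE @actual INT = dbo.AddTwoNumbers(2, 3);
    EXEC tSQLt.AssertEquals @Expected = 5, @Actual = @actual;
END;
GO
-- Run every test in the class and report pass/fail
EXEC tSQLt.Run 'testSales';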

I hope to see you there. Register today if you want to learn more about unit testing in SQL Server.

Native Audits

As our databases contain more and more information, it is becoming increasingly common to implement some sort of auditing feature to ensure that our data is not being inappropriately accessed or altered. It’s not only the management of our organizations asking for this; regulatory rules are becoming more numerous, and some may think onerous, which increases the burden on the DBA.

There are numerous products available to help here, but SQL Server has included its own internal features since SQL Server 2008. However, it seems that I encounter many people who are unaware of the SQL Server Audit feature and, indeed, may not be aware of the Extended Events system on which it’s based. This week I wanted to ask if you are using this feature.

Are you using SQL Server’s native Server Audit and Database Audit features?

If you can share any reasons or details on how you are using the audits, or how easy it is to administer them, it would be interesting. I think we do need to increase the level of auditing on our systems, at least to ensure that we are not experiencing any inappropriate data access. At a minimum, we should be monitoring for privileged-level access or account changes.
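If you haven’t tried the feature, here’s a minimal sketch of a server audit along those lines; the names and file path are assumptions:

-- Create the audit and point it at a file target
CREATE SERVER AUDIT SecurityAudit
TO FILE ( FILEPATH = 'C:\AuditLogs\' );

-- Capture account changes and failed logins at the server level
CREATE SERVER AUDIT SPECIFICATION SecurityAuditSpec
FOR SERVER AUDIT SecurityAudit
    ADD (SERVER_ROLE_MEMBER_CHANGE_GROUP),
    ADD (FAILED_LOGIN_GROUP)
WITH (STATE = ON);

-- Turn the audit on
ALTER SERVER AUDIT SecurityAudit WITH (STATE = ON);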

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (1.8MB) podcast or subscribe to the feed at iTunes and LibSyn.

DB_Owner Querying for Database Options

Another post for me that is simple and hopefully serves as an example for people trying to get started blogging as part of #SQLNewBloggers.

What can a user with the db_owner database role do? I assumed anything inside of the database (create/alter objects, assign permissions, back up the database, etc.). However, someone asked recently whether such a user could read database properties. I’d assume they can, but I needed to check.

I decided to start by creating a new login. I have lots on my test instance, but I went with just building a new one. I used the GUI to add [DBOwnerTest] because it’s quicker. I assigned a password, set a default database, and mapped this user to db_owner in a database.

[Image: creating the DBOwnerTest login in the GUI]

I then opened a query window and changed the connection:

[Image: the query window connection changed to DBOwnerTest]

Now I could easily run a query for properties and see the results:

[Image: the results of the database property query]

Note the connection at the bottom of the image above.
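The query itself isn’t visible in the screenshot, but something along these lines, reading from sys.databases, returns that kind of information (a sketch, not necessarily the exact query):

SELECT name
     , recovery_model_desc
     , state_desc
     , collation_name
FROM sys.databases;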

Here’s where it’s a little interesting. I disconnected Object Explorer and connected back as DBOwnerTest. I see this:

[Image: Object Explorer connected as DBOwnerTest]

Looks normal. According to the BOL documentation for permissions, db_owner gets View Any Database as well as control over their own database. I can see the properties of the Sandbox database (where I’m db_owner).

[Image: the properties of the Sandbox database]

But I can’t see properties of other databases.

[Image: the error when viewing another database’s properties]

I also see my login, but not the couple dozen others I have, other than sa. That’s curious, and perhaps not good. However, this isn’t the place to delve into that.

[Image: the logins visible to DBOwnerTest]

I have the ability to query through databasepropertyex(), which I’ve documented in another post.
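For example, a quick sketch using a few of the documented property names:

SELECT DATABASEPROPERTYEX('Sandbox', 'Recovery') AS RecoveryModel
     , DATABASEPROPERTYEX('Sandbox', 'Status') AS DatabaseStatus
     , DATABASEPROPERTYEX('Sandbox', 'Collation') AS DatabaseCollation;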

Certainly db_owner conveys lots of rights to the user, including the ability to see some of what’s outside the database container: the options and properties of their own database, and the list of other databases on the instance.

Writing

This was based on a question I saw posted at SQLServerCentral. It took me about 5 minutes to set up a test login and query for information. I had to perform a few searches and try some queries. I spent a few minutes researching databasepropertyex(), which became another post.

All told, this was about a 15 minute post.

You can do this. Join the #SQLNewBlogger group and start documenting your career. You can see all my posts that fall into this area by looking through the SQLNewBlogger tag here.

References

Permissions of Fixed Database Roles – https://technet.microsoft.com/en-us/library/ms189612%28v=sql.90%29.aspx

Naming Confusion

When I first worked in a corporation that had many servers, I had to learn the encoding of our systems. We had SUIS01, SUIS02, NAIS01, NAOP01, etc. Each name designated a location and department, but no information as to what applications were running. That was a separate thing to learn, which thankfully didn’t change often as procuring and setting up a new system was a relatively rare event.

However, I also ran into a few systems named “Dorothy” and “Wiz”. The other administrators griped constantly about the unprofessionalism of those names. I’m not sure SUIS01 is better or worse than Dorothy, but that’s a separate debate. In both cases, we had to memorize which applications were actually running on which systems. Since all of them were managed by IS, regardless of the department that owned them, we were responsible for understanding the purpose of each server.

I ran into a situation recently where a company had a server called SalesSQL, which was their production server. The development server was SalesSQL_Dev, which is fine until someone opens SSMS, thinks they’ve connected to the right server because they see “SalesSQL” in the connection dialog, and starts to execute code. Granted, even a rookie should be able to notice the “_Dev” (or a similar “Dev_SalesSQL” name), but mistakes happen.

I wonder sometimes if naming systems too closely to similar systems in different environments is a good or bad idea. I especially get concerned if shortcuts are taken with database names like “Sales” and “Salesqa”. I think it can be easy to confuse those systems when you become stressed or are in a hurry and are glancing at a small drop down box.

Ultimately I’m not sure there’s a good method for naming systems that prevents confusion, and ensures people are working in the correct environment. I have tended to use suffixes in my career to denote development and test environments as I can more easily differentiate a dev environment, but I wonder if I shouldn’t be using prefixes instead. Having to retype the beginning of a name (or delete the “dev”) in the connection dialogs might force me to do extra work to connect to a production system. That little extra bit of effort might be just the thing to reduce the chance of performing development work on a live system.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (2.5MB) podcast or subscribe to the feed at iTunes and LibSyn.

Controlling Alerts

None of us likes to be on call for our organizations. Late night phone calls, especially for something silly like adding a new user to a system or rebooting a machine, can make for irritable employees in the morning. Too many nights in a row of problem calls, legitimate or not, and I’d argue that an employee’s effectiveness during the day is substantially diminished.

Since many of us are paid to be creative and improve systems over time, is regular firefighting a good use of our knowledge? This might be even more true if we find ourselves receiving escalation calls on a regular basis and our daily workload doesn’t change.

I ran across a company that had set up a rule for alerts and pages. If any alert was set up in an automated system, it had to have corresponding documentation, with a resolution, in an operations manual. That’s an interesting idea, and certainly promotes the idea of performing root cause analysis and understanding your systems in advance. However, does this really reduce any escalation?

Perhaps if there is a team available 24/7 that triages initial alerts, this helps, but if there’s a small team of people that must support systems, is it worth the effort to document every issue? I suspect that a better policy is that repeating alerts, those that occur regularly, need to be analyzed and a more permanent solution implemented. I know that’s easy to say and hard to do, but that really is the best policy, at least if you want to ensure your employees are working efficiently over time.

For those calls that are ad hoc, I recommend a log of actions taken be kept in an easily accessible location, in a text format for searching. That way if a different person receives a similar alert, at least they’ll have some idea of what actions the last person performed.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (2.6MB) podcast or subscribe to the feed at iTunes and LibSyn.

Refactoring Mistakes Are Why We Write Tests

I wrote a short piece the other day trying to show how one can use tSQLt to test code. It’s a simple test built against a user defined function. It works well and when the test is run, it passes.

Here was my code:

ALTER FUNCTION [dbo].[calculateEstimateOfReadingTime] ( @value VARCHAR(MAX) )
RETURNS INT
AS
BEGIN
    DECLARE @ret AS INT = 1
          , @i AS INT = 1;

    -- count spaces to get an approximate word count
    WHILE @i <= LEN(@value)
    BEGIN
        IF SUBSTRING(@value, @i, 1) = ' '
        BEGIN
            SET @ret = @ret + 1;
        END
        SET @i = @i + 1;
    END

    -- divide by a reading speed of 250 words per minute
    RETURN @ret / 250;
END

Someone in the comments pointed out that we can refactor this code more efficiently to:

ALTER FUNCTION [dbo].[calculateEstimateOfReadingTime] ( @value VARCHAR(MAX) )
RETURNS INT
AS
BEGIN
    RETURN ( SELECT LEN(@value) - LEN(REPLACE(RTRIM(@value), ' ', '')) + 1 )
END

However when I run the test, I get these results:

[Image: the test run results, showing a failure]

That’s not good, but that’s why we test.

I could easily see someone refactoring the code, finding a more elegant way to rewrite it, and, after running some quick tests, checking it in to source control (hopefully) and maybe deploying it to production. Hopefully QA catches this, but wouldn’t we want to notice it in development?

The refactored code is missing the divide by 250.
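For completeness, here’s a sketch of the refactored function with the division restored:

ALTER FUNCTION [dbo].[calculateEstimateOfReadingTime] ( @value VARCHAR(MAX) )
RETURNS INT
AS
BEGIN
    -- word count (spaces + 1) divided by a reading speed of 250 words per minute
    RETURN ( SELECT ( LEN(@value) - LEN(REPLACE(RTRIM(@value), ' ', '')) + 1 ) / 250 )
END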

Write tests, use them to catch mistakes. These simple ones slip through at times and are what make deployments really, really stressful.

VMs are not VMs

I was at VMware recently. One of the main things the SQL Server professionals there tried to emphasize is that SQL Server workloads are not like other workloads. The impact on the various host resources and the stress on the storage systems are fundamentally different for a database server. The loads tend to be higher, though not always, and the tolerance for delays tends to be lower than for many other types of applications.

This becomes an issue if you work in an organization that doesn’t understand the challenges of database systems. It’s entirely possible that your virtualization administrators or your storage administrators don’t recognize that the SQL Server might need more resources, or they don’t believe the impact on the organization is greater. To be fair, that might be true, but someone other than the DBA or system administrator should decide if the database is more important than the file server and should be treated differently from an infrastructure perspective.

No matter what level of resources your database server needs, it’s not going to run like other systems. Typically this means that the density of VMs has to change when a database server is involved. As an example, I know of a system that typically has a 10:1 guest-to-host ratio for most of its server systems; however, for SQL Servers it’s 4:1 or lower. The same is true for storage. Aggregate bandwidth doesn’t always reflect the ability of a storage system to keep up with database requests. It becomes important that both you and your storage administrators learn to speak the same language and understand what requirements exist for SQL Server VMs.

Virtualization really starts to highlight the advantages of a DevOps environment. DBAs and developers should work closely with the virtualization and storage administrators to learn each other’s requirements and how each can help the other perform their particular job at a higher level. Infrastructure staff can help prepare standard environments and ensure production looks like staging. Developers and DBAs can help a vSphere admin learn a little PowerCLI and programming. That might get them to be more cognizant of the particular requirements of your SQL Server and be more willing to work with you.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (2.8MB) podcast or subscribe to the feed at iTunes and LibSyn.

 
