The Lighter Side–Morning at the Redgate Office

I was chatting recently with someone about the coffee machines at Redgate Software. We have 4 of them, all located in the downstairs lobby, which provide a way for people to take a break from their desk or between meetings, and wander down for a cup.

They are great machines, and I show some video recently to share with a few friends,

I visit this machine often when I’m in the office, going between lattes and black coffee throughout the day.

Great Developers

This editorial was originally publised on March 12, 2012. It is being re-run as Steve is on vacation.

Is a great software developer worth 100 average ones? On one hand I think there are some good arguments that it’s not true. One developer certainly can’t write the amount of code that 100 average ones can. However there’s another way to look at things. A great developer can do things that the 100 will never think of, or never consider. He might not write the code that does as many things as 100 people, but I think a great developer could easily write code that performs a hundred times faster than the code 100 developers write.

That’s why you always have an open position available for a great developer. If one is available, and they rarely are, you hire them if they want to work for you. You can always find things for them to do, and they can make improvements in code that your other 5, 10, or 20 developers will never come up with. I’d make sure they fit in your team and get along with others. You can get less work done if you have someone that is too difficult to deal with or too critical of others. While a great developer can accomplish things that others can’t, or won’t, they can’t do all the work.

Ultimately I think that managing great developers is hard, and they are unlikely to stay with your company for a long period of time. However they are rarely available, and for a few years, they might jump start the evolution of your software, and potentially build something that makes your software great. I’d always have an open spot in my team for a great developer, and hire them as soon as they came available, if I thought they would fit in well with the rest of the team.

Steve Jones

A Basic Recursive CTE and a Money Lesson

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

When I was a six or seven year old, my Mom asked me a question. She asked if I’d rather have $1,000,000 at the end of the month, or a penny on day 1, with the note that each day of the month, she’d double what I’d gotten the first day. Doing quick math in my head, $0.01, $0.02, $0.04, etc, I said a million.

Was I right? Let’s build a recursive CTE.

Recursion is an interesting computer science technique that stumps lots of people. When I was learning programming, it seemed that recursion (in Pascal) and pointers (in C), were the weed out topics.

However, they aren’t that bad, and with CTEs, we can write recursion in T-SQL. I won’t cover where this might be used in this post, though I will give you a simple CTE to view.

There are two parts you need: the anchor and the recursive member. These are connected with a UNION ALL. There can be multiple items, but we’ll keep things simple.

I want to first build an anchor, which is the base for my query. In my case, I want to start with the day of the month, which I’ll represent with a [d]. I also need the amount to be paid that day, which is represented with [v]. I’ll include the $1,000,000 as a scalar at the end. My anchor looks like this:

WITH myWealth ( d, v)
       AS (

— anchor, day 1
           SELECT
              ‘d’ = 1
            , ‘v’ = CAST( 0.01 AS numeric(38,2))
           UNION ALL

Now I need to add in the recursive part. In this part, I’ll query the CTE itself, calling myWealth as part of the code. For my query, I want to increment the day by 1 with each call, so I’ll add one to that value.

SELECT
   myWealth.d + 1

For the payment that day, it’s a simple doubling of the previous day. So I can do this a few days: addition or multiplication. I’ll use multiplication since it’s easier to read.

SELECT
   myWealth.d + 1
, myWealth.v * 2

My FROM clause is the CTE itself. However I need a way to stop the recursion. In my case, I want to stop after 31 days. So I’ll add that.

FROM
  myWealth
WHERE
  myWealth.d <= 31

Now let’s see it all together, with a little fun at the end for the outer query.

WITH  myWealth ( d, v )
        AS (
  — anchor, day 1)
             SELECT
                ‘d’ = 1
              , ‘v’ = CAST(0.01 AS NUMERIC(38, 2))
             UNION ALL
— recursive part, get double the next value, end at one month
             SELECT
                myWealth.d + 1
              , myWealth.v * 2
              FROM
                myWealth
              WHERE
                myWealth.d <= 31
           )
  SELECT
      ‘day’ = myWealth.d
    , ‘payment’ = myWealth.v
    , ‘lump sum’ = 1000000
    , ‘decision’ = CASE WHEN myWealth.v < 1000000 THEN ‘Good Decision’
                        ELSE ‘Bad decision’
                   END
    FROM
      myWealth;

When I run this, I get some results:

2016-05-17 18_48_04-Start

Did I make a good choice? Let’s look for the last few days of the month.

2016-05-17 18_48_16-Start

That $1,000,000 isn’t looking too good. If I added a running total, it would be worse.

SQLNewBlogger

If you want to try this yourself, add the running total and explain how it works.

Changing Your PASS Credentials

I got an email today from PASS, noting that credentials were changing from username to email. That’s fine. I don’t really care, but I know I got multiple emails to different accounts, so which account is associated with which email?

I clicked the “login details” link in the email and got this:

2016-05-24 12_31_06-PASS _ User Login

Not terribly helpful, but I was at least logged in. If I click my name, I see this:

2016-05-24 14_09_26-Movies & TV

Some info, including the email, which I’m not sure is linked to the email I clicked on, or is based on browser cookies. However, there’s no username here.

If I click the edit profile link, I get more info, but again, no username. No way to tie back anything I’ve done in the past to this account.

2016-05-24 14_12_28-Movies & TV

I have always used a username to log into the SQLSaturday site, so I went there. On this PASS property, I’ve got my username.

2016-05-24 14_14_46-Movies & TV

If I click the username, I go back to the PASS site, to the MySQLSaturday section, but again, no link to this username. However I realize now which email is related to which username.

Hopefully the others will go dormant soon and I won’t get multiple announcements, connectors, ballots, etc.

The point here isn’t to pick on PASS as much as it is to point out some poor software and communication preferences. Changing to email from username (or vice versa) can be a disruptive change. I’d expect the email would include some information on username and email relation, or at least username since it was sent to a specific email. That would allow me to determine where I might need to contact PASS to update things, or which username was affected for me.

I’d also expect that the username to be stored somewhere and visible on the site. Even if this isn’t valid login information, why not just show it? When we migrated SQLServerCentral from one platform to another, we kept some columns in the database that showed legacy information. This information wasn’t really used, but it did help track down a few problems we had with the migration. Having a bit of data is nice, and it doesn’t cost much (at least in most cases).

This wasn’t a smooth process, though not too broken for me. I like that PASS sent the communication, and I’m glad the old method still works. I logged in with username today. I wish there was a bit more consistency between PASS applications, and that they included a date when username will no longer work. I also hope they update their testing (or test plan) with any issues they discover this week, so the problems aren’t repeated.

Recharging

It’s that time of year when many people take vacation and get away from work for a bit. I’m going to join in, taking a few days off this week. I was off yesterday, with family in town for my middle son’s high school graduation.

The extended family is leaving, but this week my wife, kids, and I will take a few days in Steamboat Springs, looking to recharge and relax before everyone gets on with their busy summer.

I thought about taking a computer, maybe doing a little “fun” coding, but I think I’ve decided I’ll stick with a bike, a guitar, and just unwind in an unwired fashion this week.

As much as many of us like computers, it’s good to get away, and find something in your life that’s a change for a few days.

The Clustered Index is not the Primary Key

I was reading through a list of links for Database Weekly and ran across this script from Pinal Dave, looking for tables where the clustered index isn’t the PK. It struck me that this is one of those facts I consider to be so simple, yet I constantly see people confusing. If you click the Primary Key icon in the SSMS/VS designers, or you specify a PK like this:

CREATE TABLE ForSomething
  (
    SomeUniqueVal INT PRIMARY KEY
  );

What will happen is that a clustered index is created on this field by default. It’s not the the PK must be clustered, but that SQL Server does this if you don’t tell it otherwise. Tables should have primary keys, and while you can debate that, most knowledegable SQL Server people I know want a PK on tables. There are exceptions, but if you can’t name them now, use a PK.

However the PK isn’t a clustered index. They are separate concepts. The PK can be clustered or non-clustered, and what you choose it up to you. However, I like Kimberly Tripp’s advice. Choose the clustering key separately from the PK, keep it unique, fixed, and narrow.. If they’re the same, fine, but don’t try to make them the same. Choose what works well for your particular table, which means thinking a bit.

You get one clustering key, and it’s worth spending five minutes debating the choice with a DBA or developer, or even post a note at SQLServerCentral. Changing the choice isn’t hard, but it can interrupt your clients’ work on your database, so try to make good design choices early, without blindly accepting defaults. It’s worth a few minutes of your time to make a good choice.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (2.0MB) podcast or subscribe to the feed at iTunes and LibSyn. feed

What’s Your Test Plan?

I ran across a post on upgrading a SQL Server instance, where the original poster (OP) was asking about a document to upgrade from SQL Server 2008 to 2014. That’s a big ask, especially as not many documents tend to be written to go across three versions. The official ones, or the people that document well, tend to upgrade every version, and so they have a document to go from 2008 to R2, and from R2 to 2012, etc. However, given the pace of Microsoft releasing things, perhaps we ought to build a document and checklist across every 2-3 versions since many people may be upgrading from 2005/2008 to 2014 (or soon, 2016).

Apart from having the document, one of the questions was a list of what to test. That struck me, as I’m big on testing, and have tried to keep comprehensive plans when I had important systems. However, many of you might be like me and don’t consider most systems to be “important”. For those systems, a patch, a Service Pack, an application upgrade was really tested by applying the change to a test server and asking users to see if they could use the application. I’m not confident that there was any sort of comprehensive look at the system in these cases, but this system worked most of the time.

There were some instances that we deemed important, usually measured as such because a failure would mean some high level manager would call my manager’s boss and smelly things would slide in my direction. In those cases, we had a list of the functions and processes that needed to work. These could be application functions, queries, ETL packages, reports, or anything that would cause a user to complain. This list became our test plan, and it was kept up to date. Usually back dated, since we weren’t sure what new things were important until they failed for some reason, but once we received a ticket on an item, we added it to our list. We went through the entire list for upgrades, ensuring each item worked.

I’m wondering, do many of you have a test plan for your systems? Any system? It doesn’t matter if it’s automated or manual, but if you had to patch/upgrade instance X, are there a list of things you’d verify? Or is the system not that important, so you’d just make sure the database service was running? Let us know what your test plans look like.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 3.1MB) podcast or subscribe to the feed at iTunes and LibSyn.

The Complexity of Branches

Branching code is hard.  Well, branching isn’t hard. Just right click in some GUI or or type “git branch” and off you go. The actual branching of code is pretty simple and painless. In some sense, branching might be too simple in today’s tools as developers are almost encouraged to branch in a VCS and “test something” quickly without a lot of thought.

Even merging has become easier, with tools to visual code changes, and automatic merging in many cases. Despite the hassles of merging, it often isn’t too difficult, which can mask the complexity of branching.

However branching can easily devolve into a nightmare of multiple versions of code, with various developers unsure of the impact of their changes on a merged code base. This is especially true in a large code base where many developers each have their own branch where they work. With pressure from multiple individuals to patch bugs, add features, and streamline performance, I’ve met developers that end up building changes in three or four branches at the same time.

Merging all the work together, into the correct main branch, becomes tricky.  We’re all human, and I’ve seen developers merge a change into the wrong branch. Tools try to limit this, but it still happens. The more branches that are in use, the more time that ends up getting spent on the administrative action of reconciling code differences rather than building new code (or fixing old code).

Understanding our code is hard enough, without adding the additional complexity of versioning multiple files and objects. My philosophy has always been to limit branches as much as possible, usually only allowing one, or at most, two, per team. We have also endeavored to merge back changes every few days, just to limit the amount of time spent performing merge reconciliations. I’d rather have developers have the pain of merging daily on a shared branch than dealing with branch merges less frequently (usually more complex merges).

We’ll always have mistakes made, but using automated testing and continuous integration builds on all branches can help limit the number of obvious mistakes that break code. This won’t prevent bugs, but our tests can help reduce the number of bugs, especially regression bugs, over time. At least, they can if we use testing on all branches, all the time.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 3.2MB) podcast or subscribe to the feed at iTunes and LibSyn.

Changing a Computed Column–#SQLNewBlogger

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

I was working with a computed column the other day, and realized I had the wrong definition. In this case, I was performing some large calculation, and the result was larger than an int. However the first part of the formula was an int, which resulted in an implicit conversion to an int.

I needed to change the formula, and then realized that plenty of people might not work with computed columns much, and not realize how you alter a computed column.

You don’t.

In fact, you need to drop the column and add it back. In my case, this was what I did. Here was my table:

CREATE TABLE SiteStats
(
StatID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED
, StateDate DATE DEFAULT SYSDATETIME()
, StatMonth TINYINT
, StatYear int
, PageVisits INT
, TimeOnSite TIME
, Engagement AS (PageVisits * DATEDIFF(SECOND, CAST(’00:00:00′ AS TIME), TimeOnSite))
)

I wanted to cast the PageVisits part of the column to a bigint to solve the issue. I first needed to do this:

ALTER TABLE dbo.SiteStats
DROP COLUMN Engagement

Once that’s done, I can do this:

ALTER TABLE dbo.SiteStats
  ADD Engagement AS (CAST(PageVisits AS BIGINT) * DATEDIFF(SECOND, CAST(’00:00:00′ AS TIME), TimeOnSite));
GO

Now I have a new definition that works great.

Some of you might realize that this could be an issue with columns in the middle of the table, and it is. However you shouldn’t worry about column order. Select the columns explicitly and you can order them anyway you want.

SQLNewBlogger

A quick post, five minutes. Even if you had to search for how this works, you could do this in 10-15 minutes, tops. Research, write why you did this and potential issues with your system.