Skip to content

Parsing SQL Saturday Data – Looping Through And Loading All XML Files

After my last post on parsing the XML, I decided to continue forward and get ready to put the data in a database. For that, I’m really looking for this data:

  • event ID
  • session title

With this, I can easily insert data into a table. I’ll have separate tables for the events themselves and the speakers, but for now, I can easily showcase the titles of the sessions.

With that in mind, I decided to start expanding my efforts and building a series of loops that get all the data from the XML documents.

Looping through all files

The first thing I needed to do was loop through all the files I’d downloaded and get the documents loaded. I decided to use a DO loop for this, since I should be doing this at least once each time. Eventually I’ll add logic to avoid downloading files I’ve downloaded already.

Here’s the basic code:

$loop = 1

$loopend = 450
$doc = New-Object System.Xml.XmlDocument

do {
#start large loop

  # get the filename
  $sourceURL = $baseURL + $i + ".xml"

  # do other stuff

  $i++

#end outer loop
} while ($i -lt $loopend)

This is the basis for looping through all the file names, based on my downloads. A quick test shows this is building all the filenames I need.

Loading files

The next step is to actually load each XML file in and start querying it. I changed from the parsing code to use a loop since I’ll need to insert each item separately and I don’t think the code I had from the previous article will work. At least, I haven’t found a way.

If you know of one, let me know.

I used the Test-Path method to be sure that the XML exists, as there was at least one lost event in my initial download. I think that’s fixed now, but in any case, I added this code:

#test the path first. If it exists, load the XML
if (Test-Path $sourceURL) {
  $doc.Load($sourceURL

#trap the event number. This will be the ID I use in the database table.
    $event = "SQL Saturday #" + $i

That seems to work fine, and with with $event variable, I know which event the sessions are associated with.

Next Steps

That’s all I wanted to put here, giving me a nice, simple way of going through a series of files in a pattern. From here I’ll add more detail to the inner loop that gets the session titles out of the XML document and displays it.

The Vacation Conundrum

Would I want unlimited time off? It’s an interesting question. I think I might like something more like a minimum time off, or perhaps an allowance, with the tracking and offer to work with employees that need more. There are problems with unlimited vacation, and it’s often because of poor management and social pressure of colleagues.

Or, of course, workaholic syndrome, which is common in the technology industry.

Really, I’ve never had an issue with technology, personally. I work mostly the stated hours, but because I’m productive, I never have had problems with managers if I come late or leave early. To be fair, that’s my experience, and if everyone acted as I did, perhaps that would be problematic.

At Red Gate, I’ve been able to take days off for something, usually skiing, as I want. Actually, I usually only take a partial day off, and have to work part of the day to be sure the SQLServerCentral newsletter is prepped and going out. Or answering email, dealing with site issues, etc. I’ve even taken meetings on the road, or halfway to the mountains, pulling over to chat.

The last few years I’ve not wanted to deal with that and typically book a day off in our HR system and plan to be away. If things break, I ignore them. Or someone else needs to be ready to handle them since I’ve taken vacation and I’m not going to be pushed into working when I have booked time off.

The downside of this, which is more specific to my job, is that I have daily things that need handling, like the SSC site. As a result, if I take a day off, that usually means I’m putting in more time before the holiday to prep things. I’ve never done that as a DBA or developer. I had deadlines, but I didn’t work extra because of vacation, unless I had put myself behind.

If someone else sends an unrealistic schedule, I’m not killing myself, or skipping vacation.

It’s a tough question, and I need to think more about it. The pieces linked above are interesting and worth reading.

SQL Injection, Still?

It seems as though SQL Injection issues are still around. Attacks from SQL Injection are on the uptick as we begin moving through 2015. As noted in that piece, the constraints put on software developers usually mean that testing and security are the first things to let go when time becomes short. That’s sad, and in some sense, I think this means that we aren’t teaching secure coding early enough to developers, and certainly not often enough.

I ran across a piece from Kevin Kline that asks why this keeps happening. After all, as Kevin shows, much SQL Injection is easy to prevent. The coding patterns and tools we use are simple to write. There are lots of articles out there that show a variety of techniques you can cut and paste into your code. However there are two big problems that prevent us from eradicating SQL Injection: aging code and bad habits.

There’s no shortage of code that comes from frameworks and application templates, not to mention naive or ignorant developers. Lots of this code is vulnerable to SQL Injection. Since so many of these existing applications work, there is no great pressure to go back and change them to be more secure. Since data theft may not even be noticed, there are plenty of companies (and technologists) that have no idea their systems are vulnerable.

The other problem is bad habits. Far too many developers and DBAs have spent years writing insecure code. When they prototype, mock, or quickly knock out code, they often rely on their experience to get work done quickly. And they do this poorly if they aren’t writing in a pattern that prevents SQL Injection. They haven’t updated their templates, tools, or their knowledge in a way that ensures all their future code will be secure.

Ultimately we as an industry need to take SQL Injection seriously and write better code. Whether you use an ORM, a framework, or anything other shortcut to build applications, if you don’t create secure software, you’re part of the problem.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 2.3MB) podcast or subscribe to the feed at iTunes and LibSyn. feed

Salaries and Experience

This editorial was originally published on Aug 31, 2010. It is being re-run as Steve is on vacation.

One of the main reasons that I got into IT was the salary. Honestly, there were relatively few places where I thought I had a skill AND could make good money. I graduated with an economics degree, with the idea that I’d go to Wall Street, follow Gordon Gecko, and work 140 hours a week and make a few million.

That was until I wrecked my shoulder playing rugby my senior year, realized that I wasn’t invincible, and decided there were more important things than my salary. However I still wanted to have a good living, I had some skills in technology, and I moved in that direction. When I heard that our DBA was making $90k a year at a time when I was making $45k, I decided I should learn more about databases.

Across the last two decades of my career, I think I’ve been fortunate in my career. It’s been a lot of work, , and a lot of fun. I’ve tried to grow my career, learn new things, and find better opportunities over time and in new positions. I’ve had success, and I’ve found the experience has resulted in higher salaries for me. I founda blog about salaries recently that says it’s not only experience, but also location that matters in terms of salary. That’s probably true, though I think industry, the particular company, and other factors are important as well.

For most of us, we don’t necessarily want to change locations, but we can do something about our experience. I’ve seen it written before that for people that have 10 years of experience, it could be gained two ways. They could have one year of experience ten times, or ten different years of experience (or some combination). The point is that you should be continually expanding on your experiences, not just relying on skills you learned a long time ago.

You may or may not be able to get a job that pays you more with more experience. I’d like to think, however, that any investment in improving your skills and learning more about your craft, is never wasted.

Steve Jones

Desktop Repairs – Power Supply Issues

My desktop didn’t restart after vacation a few weeks ago. That wasn’t what I wanted to have happen with a lot of work to get done before some travel. Needless to say I wasn’t thrilled when it happened.

Photo Jan 30, 8 35 47 AM

I was worried that something major had died. However I’d replaced the main boot drive recently and I had the old one. If I could get power.

I emailed Glenn Berry, who’s my go to hardware consultant. He suggested I check the power supply since I couldn’t get anything to happen. On his advice, I went to the local Microcenter and bought a new 650W power supply. Here’s the old one, with a stock CPU cooler. I bought a new one of those as well, since I find the heat alarm going off during some video editing.

Photo Jan 30, 10 11 59 AM

I didn’t remove the old power supply at first. I actually disconnected it and connected the new one to the motherboard from outside the case and powered things on. Sure enough, everything came on. Lights, drives, etc.

I then powered things down again and set about replacing the power supply. It’s actually easy, and I’m surprised I’ve never had to do this before in my life. Probably because I’ve tended to replace machines so often.

In any case, it was a matter of removing 4 or 5 screws from the back of the case, carefully slipping out the old PS and then putting the new one in there.

The CPU cooler was trickier. The tabs push through the motherboard and hook, but they are somewhat fragile. I was worried, and sure enough, the first time they weren’t tight. The cooler needs to be held tight to the CPU to pull off heat. It was loose and I got heat alarms once I started using the machine.

However I kept messing with it, slowly trying to get the tabs in there and eventually they are holding fairly tight. I still get heat alarms during heavy duty video processing, so I think this cheap $25 cooler isn’t good enough. Or I don’t have enough thermal grease. Either way, I’ll probably replace this soon.

Here’s everything together.

Photo Feb 13, 10 50 03 AM

And a happy Steve.

Photo Feb 13, 10 50 45 AM

Closer to the Heart

Most of us take whatever jobs we’re offered. We send out resumes, go through interviews, and often accept the first offer for employment. In fact, I bet most of us rarely have more than one employer to choose from at a time. Perhaps two if we’re looking for a job while still working, but I bet in many situations the current employer isn’t one we’re considering working for in the future.

What’s more, it seems that many people working in technology can work in a variety of industries. A database that stores sales information for a retail company isn’t managed a lot differently from one that captures inventory information for a manufacturer. I know I’ve worked in the power generation, education, finance, software, and other fields. I’ve found that my job as a DBA or developer hasn’t really changed much. I could work in any of these industries.

However, I’m sure that most of you have interests and attractions in areas. Perhaps some of you have really enjoyed working with certain data, or there are applications you think you’d enjoy. This week I wanted to ask what might be your ideal field.

In what industry, or with what data, would you like to work if you had the choice?

Or perhaps, where do you think you’d have a connection with the technology as well as the business. Imagine that you could choose what data you’d work with, and some employer in that field would offer you a job.

For me, I think if I had the chance I’d like to work with a professional sports team in the US. In Denver, I’d really be interested if the Broncos, Rockies, or Nuggets needed some technical help. These organizations are starting to use more and more data to analyze their players in order to gain some competitive advantage. I find all the numbers published to be interesting and think it would be a fun job helping to find patterns and meaning in the data.

Let us know this week where you might want to work.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 2.2MB) podcast or subscribe to the feed at iTunes and LibSyn.

tSQLt – SQLCop – Checking Naming Conventions

I’ve been using tSQLt a bit to do some testing and one of the things I’ve tested is standards for code. I’ve been using a framework on top of tSQLt called SQLCop. These are a series of tests written to look for specific things. One of the items I do check is for sp_ named procedures. I’ve mostly gotten out of the habit of doing this, preferring spProcName, but at times I make a mistake in typing. This catches those simple errors.

Using SQL Cop

You can Download the SQLCop tests and install them in your database after you’ve setup tSQLt. If you are using SQL Test, then you also get the SQLCop tests installed when you add the framework to a database. For me, I see the tests in the SSMS plugin.

tsqlt7

There are a lot of tests, but in this piece, I’ll look at the Stored Procedures Named sp_ test.

If I edit the test, I see it’s fairly simple code. I’ve included it here.

USE [EncryptionPrimer]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [SQLCop].[test Procedures Named SP_]
AS
BEGIN
-- Written by George Mastros
-- February 25, 2012
-- http://sqlcop.lessthandot.com
-- http://blogs.lessthandot.com/index.php/DataMgmt/DBProgramming/MSSQLServer/don-t-start-your-procedures-with-sp_

SET NOCOUNT ON

Declare @Output VarChar(max)
Set @Output = ''

SELECT @Output = @Output + SPECIFIC_SCHEMA + '.' + SPECIFIC_NAME + Char(13) + Char(10)
From INFORMATION_SCHEMA.ROUTINES
Where SPECIFIC_NAME COLLATE SQL_LATIN1_GENERAL_CP1_CI_AI LIKE 'sp[_]%'
And SPECIFIC_NAME COLLATE SQL_LATIN1_GENERAL_CP1_CI_AI NOT LIKE '%diagram%'
AND ROUTINE_SCHEMA <> 'tSQLt'
Order By SPECIFIC_SCHEMA,SPECIFIC_NAME

If @Output > ''
Begin
Set @Output = Char(13) + Char(10)
+ 'For more information: '
+ 'http://blogs.lessthandot.com/index.php/DataMgmt/DBProgramming/MSSQLServer/don-t-start-your-procedures-with-sp_'
+ Char(13) + Char(10)
+ Char(13) + Char(10)
+ @Output
EXEC tSQLt.Fail @Output
End
END;

This code looks at the meta data in the database for an routines, stored procedures, that start with sp_ as part of their name. If any results are returned from the query, the IF statement will be true and the @output will be returned as part of the tSQLt.Fail call.

Using the Test

Let’s write a stored procedure. If I do this:


CREATE PROCEDURE spLetsTestThis
AS
BEGIN

SELECT TOP 10
e.EmployeeID
, e.EmpTaxID
, e.FirstName
, e.lastname
, e.lastfour
, e.EmpIDSymKey
, e.EmpIDASymKey
, e.hashpartition
FROM
dbo.Employees AS e;

RETURN 0;
END;

GO

This is a simple procedure. I wrote it, execute it a few times and be sure it’s what I want. I’ve done basic testing, not let’s check it before I commit it to VCS.

The easy way to execute all the SQLCop tests is to right click them in SQL Test and execute them. I can also use T-SQL to run tests. However since I just want to show this one, I’ll right click it and select "Run Test".

tsqlt8

This runs the test selected. I can also run an entire class, or all tests, but clicking in the right spot. In this case, the test passes and I see a green mark.

tsqlt9

Now let’s write a new procedure:

CREATE PROCEDURE sp_GetArticles
AS
SELECT *
FROM dbo.Articles

GO

This is a bad procedure for a variety of reasons, but let’s execute my test. I see it fail, and a red mark appears next to my test.

tsqlt10

In this case I also get a window from SQL Test popping up with more details. This contains the output from the test, which is also inserted into a table by the tSQLt framework.

tsqlt11

Note that there is a URL with more information on this particular test. That is a part of the SQL Cop test code above. I could easily replace this with something particular to my environment if I chose.

At this point, I can rename the object, drop and recreate it, etc. to correct the issue. However running this test helps me to be sure I’ve gotten good code into the VCS. If I have this also run as a part of a CI process, it then prevents bad code from other developers appearing.

Meeting Standards

There are all sorts of SQLCop tests, and I’ll write about more, but this is an easy one to implement to prevent a bad practice in your coding by a team of developers. Allowing each developer to test themselves, as well as an overall check by some CI process means that our code quality improves.

If I have other standards, I can even write my own tests to enforce them, which I’ll do in another piece.

Downloads

Starting a New Job

I’ve had practice here, and was reminded of this when I ran across this thread on starting a new job. In my career, I’ve had far too many employers. I’d like to think it’s not my fault, but after having double digit jobs, I have to take some of the blame. I’ve often been looking for positions that I expected to last for many years, but something always seems to crop up. Even the jobs I’ve loved have ended sooner than I would have liked.

As a result, I’ve gotten used to starting new jobs, learning a new environment, and quickly getting familiar with people and processes. I thought of this when I ran across a thread that talked asked about the first few things you should do at a new job. There are some good answers, and it’s worth a read.

For those of you that start new jobs regularly, or are consultants, do you have a routine? Do you have scripts you bring along? When I saw sp_Blitz presented a few years ago, it seemed like exactly the type of thing you would want to have with you and go through on each instance you encounter. Something that gives you a good overview of how things look. While each environment is different and the configuration and settings will vary, it helps to have a view of the situation you’re familiar with. Even if you don’t change anything, at least you understand how the systems are set up.

There was a time when I had a complete evaluation routine similar to sp_Blitz and an install that would create jobs for backups and maintenance in a consistent way. I had more a few jobs switch to my way of managing things because it was very efficient. These days, I think I’d use tools that are tried and true, using sp_Blitz, Ola’s backup routine, and @SQLFool’s indexing routine. I wonder if the rest of you have similar preferences.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 2.4MB) podcast or subscribe to the feed at iTunes and LibSyn.

Detecting Encryption

I ran across an article recently from MSSQLTips by my friend, Brian Kelley. It talks about the ways you can detect encryption in use in your database. Brian’s approach, which is one I agree with, is that you can look for symmetric keys, asymmetric keys, and certificates in the system tables. The tables you query are:

  • sys.symmetric_keys
  • sys.asymmetric_keys
  • sys.certificates

That’s a good way to detect SQL Server encryption in use, but not encryption in general. One of the things I’ve advocated for applications that contain sensitive data and need to be protected from the DBA is to have the application create temporary keys or use .NET libraries to encrypt data. In that case, SQL Server just sees data, and doesn’t detect encryption.

Brian offers a solution that is to examine any columns containing these data types

  • binary
  • varbinary
  • image

That’s a good start, but how do you detect that this string is encrypted?

504b 0304 1400 0000 0800 1a86 4640 0d41 …

That’s actually not encrypted; it’s the start of a zip file. However it could be a jpg, a tiff or some other binary format. The only way I thought of was mirrored in this Stack Overflow note: you’d have to compare known file types and look for a pattern in a header of some sort that doesn’t match. It wouldn’t be sure you didn’t have encryption, but you might make some educated guess if no file type that might fit the data matches.

There was also a link in the comments to a Stack Exchange discussion on the same topic. It’s similar, though I saw the use of the KEY_NAME() function in there. I hadn’t used it, perhaps because of the poor documentation of encryption in SQL Server. I also found a KEY_ID() function that works similarly, returning the ID for the name of a key.

Ship Safe…Ship Often

The Red Gate tag line has been Ship Often…Ship Safe, which works great for developers. Make the changes to code as fast as you can and get them deployed. Keep things smooth (or safe) with a routine and a standard way of deploying code. Everyone wins, right?

When we were rehearsing the SQL in the City keynote last year, we came to this line in the script. I wanted it changed. As a DBA, I need to Ship Safe first. If I can do that, then I’m happy to Ship Often, as often as changes can be tested. I haven’t ever worried about changes being made too fast, as long as they can be made safely.

These two principles seem to be fundamentally at odds with one another. Certainly there is some tension and conflict, but if those pushing through changes can follow a well engineered process, then dangers shrink. If the people responsible for stability and availability accept the changes coming through a process, then they can accept them rapidly and more often.

Ultimately the closer we move to an engineering process, the more likely that we can deliver software at a higher quality level and quicker. However we need to define our engineering processes using the best practices and knowledge we have, and then stick to the the steps we’ve decided upon. Only then can we Ship Safe, and Ship Often.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 1.7MB) podcast or subscribe to the feed at iTunes and LibSyn.

Follow

Get every new post delivered to your Inbox.

Join 5,045 other followers