What Helps You Learn?

There was a post recently asking how to train a junior DBA: what tasks do you give them, how do you get them productive, and so on. The responses were interesting, but I was struck by one note from a person who is a junior DBA. This person mentioned a few things that have helped them get better at their job, some of which I hadn’t thought about before.

Since I try to teach people about SQL Server in writing, in presentations, and in longer formats, I decided this might make a good poll question. If you spend a few minutes thinking about how you’ve improved your skills at your job, let us know what works best for you.

What things can others do to help you learn more about your job?

Are there specific things that have actually worked better for you? Perhaps you prefer to learn alone, but even then, what works well for you to build skills? Is there a format or style of presentation that reaches you?

I’m a big believer in multiple voices and multiple ways to teach people. This is one reason we sometimes have 2-3 articles on the same topic from different authors. Each will reach different people and speak to them in different ways. I think the format (text, video, live, etc.) also has some impact on what helps different people learn. Let us know today what works best for you.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (2.1MB) podcast or subscribe to the feed at iTunes and Mevio.

More Azure Database Creation in Powershell

I wrote about Azure SQL Database for T-SQL Tuesday #82, but I had a few mistakes in my process, at least from a cost standpoint. Since I’ll probably use this a bit more as I flesh out some CI/CD, I decided to document a bit more about the database creation process.

The Old Code

One of the problems I noticed right away was that I created an S2 level database in my post, which was too large. That wasn’t needed, and it might eat up my database credits. My original code was:

New-AzureRMSqlDatabase -ResourceGroupName "Default-SQL-WestEurope" -ServerName "dkranchapps" -DatabaseName "TSQLTuesday" -Edition Standard -RequestedServiceObjectiveName "S1"

There are a few things in here that are obvious to me, and some that might not be. I won’t delve into detail on how to choose some of these, but I’ll define what they mean. There is a doc page for the cmdlet, which I used to check what I knew.
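
Rather than the doc page, you can also check the parameter list locally with Get-Help; a minimal sketch, assuming the AzureRM.Sql module is installed:

# Assumes the AzureRM.Sql module is installed (Install-Module AzureRM if not)
Import-Module AzureRM.Sql

# Full help, including descriptions for every parameter
Get-Help New-AzureRMSqlDatabase -Full

# Just the parameter names
(Get-Command New-AzureRMSqlDatabase).Parameters.Keys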

ResourceGroupName

This is the way you organize Azure resources. In my case, I have one resource group, which I named “Default-SQL-WestEurope”. This was the first group I made, intending this to be for SQL servers, and it’s in the West Europe region. Hence the name. 

ServerName

This is self-explanatory: you create servers to host resources in Azure. The one I created a long time ago for experiments was “dkranchapps”. I re-used it here, though I could have created another one.

DatabaseName

I’m not defining this. If you don’t know, stop reading.

Edition

This is the basic edition for your database. There are multiple options, but I used “Standard”, which was in the sample documentation code. However, a Standard database incurs a charge from $0.0202/hr to $0.2016/hr. That doesn’t sound like much, but there are 720 hours in a 30-day month. That’s $14.54 to $145.15 a month. Again, not much, but this is per database. This could cause me issues with my $150/month credit.
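
As a quick sanity check on that math, a couple of lines of PowerShell using the hourly rates above:

# Back-of-the-envelope monthly cost for one Standard database (720 hours in a 30-day month)
$hours = 720
$lowRate = 0.0202    # lowest Standard hourly rate quoted above
$highRate = 0.2016   # highest Standard hourly rate quoted above
"{0:C2} to {1:C2} per database per month" -f ($lowRate * $hours), ($highRate * $hours)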

RequestedServiceObjectiveName

This is the scale within the Edition. I’m not sure why the Edition is needed if we specify this here, but we can use a name or a value (GUID). In my case, I neglected to check the S2 syntax when I made the database. What I’d want for most things is Basic, or maybe an S level if the database is short-lived.

Other Options

There are other options. I can specify the subscription in a parameter, do a “WhatIf” to see what the command would do, force the command without user confirmation, set a maximum size, and more.

While you might not need these options, I’d encourage you to look through the list and make sure you aren’t unintentionally omitting something that might be important in your situation.
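
For example, a dry run with an explicit size cap might look something like this; it’s just a sketch, and the database name and the 100MB size are made up for illustration:

# Preview what would happen without creating anything (-WhatIf), capping the size at 100MB
New-AzureRMSqlDatabase -ResourceGroupName "Default-SQL-WestEurope" -ServerName "dkranchapps" -DatabaseName "SizeTest" -Edition Basic -RequestedServiceObjectiveName "Basic" -MaxSizeBytes 104857600 -WhatIf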

Creating a Basic Database

One of the things I want to try is scripting the creation of a database in my CI process. That’s not quite that simple, and it’s a task for another day, but here’s a basic call that I can work on adding to my builds.

New-AzureRMSqlDatabase -ResourceGroupName "Default-SQL-WestEurope" -ServerName "dkranchapps" -DatabaseName "" -Edition Basic -RequestedServiceObjectiveName "Basic"

Once I run this, I can check the portal, and sure enough, I see my database at the correct size.

[Screenshot: SQL databases blade in the Microsoft Azure portal, 2016-09-21]
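
If I don’t want to click through the portal, I can also verify from PowerShell; a minimal sketch, where “MyNewDatabase” is a placeholder for whatever name was passed to New-AzureRMSqlDatabase:

# Confirm the database exists and check its edition and service objective
# ("MyNewDatabase" is a placeholder for the name used in the create call above)
Get-AzureRMSqlDatabase -ResourceGroupName "Default-SQL-WestEurope" -ServerName "dkranchapps" -DatabaseName "MyNewDatabase"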

That’s the first step to automating some builds. Now I need to worry about security and API keys, but for now, I can build a database.

And, of course, remove it with Remove-AzureRMSqlDatabase.
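
Something like this tears it down again when I’m finished, using the same placeholder database name as above:

# Drop the test database so it stops accruing charges
Remove-AzureRMSqlDatabase -ResourceGroupName "Default-SQL-WestEurope" -ServerName "dkranchapps" -DatabaseName "MyNewDatabase"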

Capture the Flag

I’ve read a lot of science fiction in my life. Quite a few books talk about cyberspace and the interactions of humans and software. In Snow Crash, there are AI systems defending other systems. The Unincorporated Man series has AI systems that not only have sentience, but hide their capabilities from humans. Numerous other examples exist, including movies such as The Matrix, which popularized the idea of system programs able to independently evaluate and react to security issues.

Science fiction is here, showcased recently at the DARPA Cyber Grand Challenge. You can watch a bit about the challenge or read a short article about what happened. Overall, it’s amazing. Seven systems ran various pieces of software targeted at finding vulnerabilities, patching their own, and exploiting those of other systems. All without human intervention.

In a visualization, I picture a team of Agent Smiths (from The Matrix) seeking out rogue programs that try to wreak havoc inside the operating environment. Or maybe a series of automated developers tracking down bugs, such as buffer overflows, and correcting the code to remove the issue.

Is this where static code analysis is heading? Automated repair, or maybe even real-time repair? Could we actually have a software bot that might run through your Entity Framework (EF) application, tracking down SQL Injection vulnerabilities and correcting them? Even at a rudimentary level, I can imagine a bot that examines incorrectly parameterized queries and rewrites the .NET code to parameterize them properly. Perhaps at some point we’ll even have bots that can understand queries from some application and build stored procedures on the fly that accept parameters, are called the next time a particular method is invoked, and can even grow to evolve with schema changes.

I know the simplistic ways in which we might build this code would have their own sorts of issues, but I’m hopeful that at some point we’ll have semi-intelligent bots that can fix code to ensure it’s secure and reliable, while keeping the creativity originally built by the human developer.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.2MB) podcast or subscribe to the feed at iTunes and Mevio.

Analyzing Data in Real Time

A few years ago I was listening to a presentation from Microsoft where they touted a customer that had real time analytics with Analysis Services. Since my exposure had been that cubes needed to be processed, a resource intensive process, I was wondering what this meant. Imagine my surprise to hear that the cube was being updated within 5 seconds of the changes in the OLTP system. Talk about real time.

Of course, the definition of real time isn’t really agreed upon. We linked a piece in this week’s newsletter that looks at a survey of companies and their investment in real-time, streaming data movement pipelines. The definitions of real time from companies range from sub-second to 24 hours. I’m not sure I think that 24 hours is real time, but if you’re used to processing cubes weekly, maybe that feels real time. I tend to see real time on the order of minutes, maybe a few tens of minutes. Once we get to hours, that feels more like batch processing, albeit fairly rapid batch work.

I am sure there are systems that need real-time analysis, especially in areas where decisions must be made quickly. As we move to more and more machine learning algorithms and automated intelligence, there will certainly be more call for real-time data movement. That’s yet another opportunity for data professionals, especially those that work with ETL pipelines. There will be tough problems, not only with moving data, but also with tracking lineage, recovering from issues, and even weeding out bad data quickly.

However, that’s an opportunity for you, not me. I bet there are a few data professionals out there that feel the way I do. There are great challenges in solving real-time problems and building systems that can handle high volumes of data. The thing is, I value my sleep. I value not working all-nighters, and I value balance in my life. While I find the problems fascinating and the money involved tempting, I prefer to work on more pedestrian systems.

I’m glad there are people that want to work on very difficult problems, and I wish you all the best in taking advantage of these opportunities. I hope you’re well paid, and you have a great time building these impressive systems. I also hope to read about some of the amazing things you do, so please, share the knowledge where you can. This is a great, exciting time to work with data, and it’s one that I continue to enjoy every day.

Steve Jones

 

Restaurants At Summit

At the PASS Summit this year, there won’t be a Thursday night party. I think that’s fine, as it never seemed like a highly attended event, and the costs are high.

That means that Thursday night is open, and I know lots of you will be looking for something to do. Or you’ll know someone who’s looking for something to do. In either of those cases, Andy Warren and I are trying to organize a few things. One of these will be around dinner plans for attendees.

That’s where I’m looking for help from you. I need a list of restaurants in the downtown Seattle area.

We’re looking for recommendations from those of you that have eaten dinner in Seattle. Give us a name/address, a cost rating ($, $$, or $$$), and how busy it was. We’re looking for places that are within walking distance or a short Uber/Lyft/cab ride from the Convention Center area.

Leave a comment with your recommendation and watch out for our announcement soon on the meetup plans.

Are You a Tech Company?

Are you a tech company? That’s an interesting question. I’m sure some of you might not think so, working for a commercial organization that has some physical product you sell or some service you provide. You might not consider yourself a tech company. However, your organization might be, at least according to this essay at Medium.

I certainly think software has changed, and will continue to change, the world. I think data is an important part of our systems, but in conjunction with software and the ease with which we can gather, process, extract, and analyze the information inside of our data. In order to do that, I do believe we need ways to better build and deploy software, whether that’s database software or application software.

To do that, can we be afraid of mistakes? I think not. The opening premise of the piece is looking at rapid delivery from two points of view. Can a system be secure if you change it multiple times a day? Or perhaps it can’t be secure if you don’t have the ability to change things. Either view could be true, and most of us lean one way or the other. My view is towards the latter, not because I want to release multiple times a day, but because if there are issues, bugs, or security holes, I need to be able to change my software that day. I can’t wait for weeks or months.

That’s one of the problems I’ve seen with quite a few systems in the past. The time to release a new version, to patch an issue, is far, far too long when potentially hundreds, or thousands, of malicious attackers could take advantage of a flaw this week. Or our company could lose customers because a feature is broken for long periods of time. While I don’t think the cost of fixing mistakes is zero, I do think that the cost of not fixing issues is also more than zero. Certainly at times the cost of not fixing something might be much, much higher than our organization would tolerate.

Mistakes will happen. However, the point the essay makes is valid. We should learn from mistakes, and not try to avoid them, but continually push forward to improve our software. Above all, we need to find ways to engage our employees and get them to believe they can produce better software and are empowered to do so.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.4MB) podcast or subscribe to the feed at iTunes and Mevio.

The 10X Programmer

Is there such a thing as a 10x programmer? It’s an interesting idea, and one that is discussed from a few perspectives in this Q&A. There’s some talk about how to define “x”, as well as the idea that the environment and project might make a big difference in productivity. Certainly the baseline of the staff being measured makes a difference. If most of your programmers aren’t very productive, then a more talented developer might easily outperform the average by a factor of 10, depending on how you measure productivity.

I wonder, though: is there a 10x DBA or database developer? What would that look like? If we examined the effectiveness of a DBA managing systems, I have seen DBAs that managed 10x the absolute number of databases and instances. Working inefficiently, connecting to the individual systems one at a time to apply updates or check statuses, could easily make you ten times less efficient than the DBA making use of scripting and the ability to apply changes across many systems, not to mention monitoring them for issues.
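
As a rough sketch of the scripting side (the instance names and the check query here are made up for illustration, and it assumes Invoke-Sqlcmd from the SqlServer module), one loop can hit a whole list of servers in a single pass:

# Hypothetical list of instances one DBA is responsible for
$instances = "SQL01", "SQL02", "SQL03"

foreach ($instance in $instances) {
    # One simple status check per instance; swap in whatever query or change script you need
    Invoke-Sqlcmd -ServerInstance $instance -Query "SELECT @@SERVERNAME AS ServerName, name AS DatabaseName, state_desc AS State FROM sys.databases;"
}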

In terms of database development, if we have some programmers that can produce code using a tenth of the resources that others need, I’d consider that a 10x programmer. Given the state of questions I see asked on SQLServerCentral and other sites, it definitely seems that there are plenty of developers being paid to produce poorly written T-SQL code. We don’t have great definitions of how to measure productivity, but there is plenty of code posted that can be dramatically optimized. It’s the rare week that I don’t see someone noting they’ve reduced the runtime of some code from hours to minutes. Isn’t that 10x more productive?

Ultimately one of the amazing things about software is that we aren’t bound to a particular structure once it’s created and used. We can quickly, and easily, change the way the code is built so that it works differently, does more, runs efficiently, or really anything we want. Of course, testing that the code produces the desired effects and deploying that to different environments in a way that minimally impacts clients is a challenge, but a challenge many in our industry work on constantly, seeking improvement.

I think there are definitely 10x programmers, but that really depends on your situation and how you utilize the people working for you.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (4.2MB) podcast or subscribe to the feed at iTunes and Mevio.

Why We Need Backups First

One of the things I show when I talk about TDE is that the backup of the certificate is very important. When you create a certificate, SQL Server warns you right away that a backup of the certificate is needed. Following Murphy’s Law, if you continue on with enabling TDE, right after the encryption is completed, your master database will become corrupt and you won’t be able to get to your data. It just takes a few seconds, so back up your certificates.
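
For reference, the backup itself is a single T-SQL statement; here’s a sketch run through Invoke-Sqlcmd, with made-up server, certificate, file, and password values:

# Back up the TDE certificate and its private key from master
# (the server name, certificate name, file paths, and password below are placeholders)
Invoke-Sqlcmd -ServerInstance "MySqlServer" -Database "master" -Query "
    BACKUP CERTIFICATE TDE_Cert
    TO FILE = 'C:\Backups\TDE_Cert.cer'
    WITH PRIVATE KEY (
        FILE = 'C:\Backups\TDE_Cert.pvk',
        ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHere1!'
    );"
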
I was reminded of this while reading a short piece on some police body camera footage being lost. I don’t want to focus on the police actions or cameras or anything but this statement: “A backup system had been purchased but hadn’t been set up.”

That’s a fairly egregious breach of duty for some IT personnel in this case. The system was installed in 2009, and this event apparently occurred during a software upgrade in 2012. While I understand a backup system might not get configured immediately, I think 3 years stretches the limit of an acceptable delay. However, even if the backup system were not in place at the time of the software upgrade, wouldn’t a backup before the upgrade be required? Isn’t that a standard practice?

There have been other reports of similar issues, which is disconcerting. It appears the use of technology for data archival isn’t well developed, or isn’t properly understood by the IT staff in charge of the process. Of course, this happens in plenty of enterprises and other organizations where IT staff seem to think that backups are important in and of themselves. In case you’re confused, they are not. Restores are the important thing to have, which means you must verify your backup process allows the data to be retrieved.

Many of us have scripted methods that exercise this regularly, refreshing other environments, restoring databases for DBCC checks, or some other task. Few of us actually practice this ourselves across all databases on a regular basis, ensuring our skills for tail-log backups and NORECOVERY and STANDBY restores are fresh and ready for emergencies. However, we should not take this task lightly. It is important that we are able to recover data in almost all situations.
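
A minimal sketch of that kind of check, with placeholder server, file, and database names, restores a recent full backup to a scratch database and then runs DBCC CHECKDB against it:

# Restore a recent backup to a throwaway database on a test server
# (server name, file paths, logical file names, and database names are placeholders)
Invoke-Sqlcmd -ServerInstance "RestoreTestServer" -Database "master" -QueryTimeout 3600 -Query "
    RESTORE DATABASE RestoreCheck
    FROM DISK = 'C:\Backups\Production_Full.bak'
    WITH MOVE 'Production' TO 'C:\RestoreTest\RestoreCheck.mdf',
         MOVE 'Production_log' TO 'C:\RestoreTest\RestoreCheck_log.ldf',
         REPLACE;"

# Prove the restored copy is actually usable
Invoke-Sqlcmd -ServerInstance "RestoreTestServer" -Database "master" -QueryTimeout 3600 -Query "DBCC CHECKDB ('RestoreCheck') WITH NO_INFOMSGS;"
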
If we can’t do that, then we are truly disregarding our duty as the custodians of data.

Steve Jones

The Danger of xp_cmdshell

Securing a computer is a challenge. There are all sorts of potential issues on every platform, and ensuring the safety of your data can be less a reflection of your ability and more a matter of the good fortune that there isn’t a focused effort to attack your systems. However, we certainly also face issues with inside users, many of whom may make mistakes that are accidental rather than malicious. It’s for these reasons that we look for secure-by-default applications and a reduced surface area for any system.

Many people refuse to turn on xp_cmdshell as an option for scripting in SQL Server. It is disabled by default, and quite a few DBAs are glad of that. However, there are plenty of people that think xp_cmdshell isn’t a big security risk. There are certainly ways to mitigate its usage by non-privileged users, and it can be a very handy tool for accomplishing work without a lot of development time.
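
For context, turning it on is a couple of sp_configure calls, and the usual mitigation for non-sysadmin callers is a low-privilege proxy account; a sketch, with placeholder server, account, and password values:

# Enable xp_cmdshell (it is off by default)
Invoke-Sqlcmd -ServerInstance "MySqlServer" -Query "
    EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
    EXEC sp_configure 'xp_cmdshell', 1; RECONFIGURE;"

# Non-sysadmin callers then run under this low-privilege Windows account instead of the service account
Invoke-Sqlcmd -ServerInstance "MySqlServer" -Query "EXEC sp_xp_cmdshell_proxy_account 'MYDOMAIN\LowPrivUser', 'PlaceholderPassword1!';"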

This week, as security issues become more important to us all, I’m curious how you feel.

Do you think xp_cmdshell is dangerous?

I have to admit that I’m torn. I don’t think this is inherently dangerous. It does open up some attack vectors, but the last few versions of SQL Server have allowed some limitations, so I would enable this if needed to solve some issues, without too many concerns. However, I wonder if many of you feel the same way.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (2.2MB) podcast or subscribe to the feed at iTunes and Mevio.

Data Science Education

One of the good skills for a data professional to have is knowing how to analyze data. Most of us could learn more about data science and data analysis for some aspect of our jobs. We are data professionals, so we should understand how to analyze data. I’d expect that a competent data professional would be able to put together a report on some set of data that means something to an end user. If we administer systems, then analyzing usage (index, space, etc.) is a skill we need. If we write code, sooner or later we’re going to write some report for a client. Either way, we need to perform some sort of analysis.
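
As one small example of that kind of administrative analysis, a query against the index usage DMV (the server and database names here are placeholders) shows which indexes are being read versus only written:

# Compare reads to writes for indexes in the current database
Invoke-Sqlcmd -ServerInstance "MySqlServer" -Database "MyDatabase" -Query "
    SELECT OBJECT_NAME(ius.object_id) AS TableName,
           i.name AS IndexName,
           ius.user_seeks + ius.user_scans + ius.user_lookups AS Reads,
           ius.user_updates AS Writes
    FROM sys.dm_db_index_usage_stats AS ius
    JOIN sys.indexes AS i
      ON i.object_id = ius.object_id
     AND i.index_id = ius.index_id
    WHERE ius.database_id = DB_ID()
    ORDER BY Writes DESC;"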

How can you learn more? There are lots of resources available. I thought it was interesting that Microsoft has teamed up with edX for their own data science degree. While there are mixed feelings on this, I think edX has a good platform and strong partnerships for teaching. There are other places, such as Coursera, that are doing the same thing, offering a variety of courses online.

In fact, it appears that quite a few educational institutions and businesses are starting to increase their data science related offerings. There’s a good summary of some of the options in a piece from Dataversity. In response to all the demand, or at least perceived demand, there is everything from boot camps that get you up to speed on some quick analytic techniques and tools to full degree programs. Some programs have different levels, depending on the amount of statistical knowledge you have.

Learning something about statistics is probably the best place to start for most people. I’ve seen quite a few people start learning by reviewing some statistics techniques. From Buck to Mala, putting your own learning down in a blog will force you to ensure that you actually understand the principles. As for getting started, there are lots of great books you can buy, or you can read information online. One note: be careful and double-check the information written on random blogs. Or check out places like the Khan Academy.

In some sense this reminds me of the rush to get a CNE in the early 90s, or the many boot camps that appeared to help people get an MCSE certification in the mid to late 90s. Lots of people passed the tests but weren’t very qualified, which has contributed to the general distrust of certifications today. I hope that doesn’t happen too much this time, though I’m sure it will to some extent. The chance to make more money, or just find a job, will drive lots of people to look for quick wins.

While I’m sure some of you will get better jobs, perhaps making a lot of money with minimal data science experience, I hope many of you continue to learn and improve your skills in the data analysis area over time, whether you are paid to be a data scientist or not.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.5MB) podcast or subscribe to the feed at iTunes and Mevio.
