The Voyage for an Auto Deploying Database solution

I recently undertook the task of setting up a new development environment complete with Source Control, Continuous Integration, Scrum management, and Auto Deployment to a Test Environment. I’ve had experience in the past with this sort of solution using Microsoft tooling like MSBuild, MSDeploy, TFS 2010 and custom build steps in WF. This time, however, SVN is our source control software, and it doesn’t come bundled with all the other things TFS gives you.

Having heard a lot about JetBrains’ TeamCity over the years, I was naturally pulled toward it as the starting point. I quickly discovered a fantastic five-part series of posts Troy Hunt wrote on his blog, called “You’re deploying it wrong! TeamCity, Subversion & Web Deploy”, over at http://www.troyhunt.com/2010/11/you-deploying-it-wrong-teamcity.html. I can’t recommend his blog enough! In a very short time, our development network was set up with TeamCity, which ran CI builds upon SVN check-in and also auto deployed our web applications upon successful CI. The only thing lacking now was getting the database to update as well, which turned out to be a whole different kettle of fish.

So I reached out to Google in search of accepted or recommended approaches for automating database updates. My original plan was to have TeamCity treat all the script files it finds as artifacts it could then deploy to our SQL Server databases. After quite a bit of investigating, the same few options kept coming up over and over. One favourite among others seems to be integrating a number of Red Gate’s solutions into the TeamCity process. Good old Troy Hunt even had another post up on how to do just that, called “Automated database releases with TeamCity and Red Gate”, over at http://www.troyhunt.com/2011/02/automated-database-releases-with.html. The thing is, it went against the way we do script roll-outs at our company, so I was apprehensive about changing our ways and kept on searching for alternatives. I also posted a question to the great community at Stack Overflow seeking other alternatives I could consider before getting stuck in (http://stackoverflow.com/questions/18589906/autodeploy-sql-changes-with-teamcity-alternative-to-redgate). EF Migrations was the answer there, but again it wasn’t really what I was looking for.

Our current standard for .NET solution design is to include a folder within our SVN repository which contains all our change scripts, ordered by version and run order. It’s this structure I wanted to maintain, and that’s what caused me to seek alternatives to the well-documented solutions mentioned earlier.

MySolution
 > MySolution.Data
    > 3.0 Scripts
       * 01 - UpdateUserTable.sql
       * 02 - UpdateRoleTable.sql
    > 3.1 Scripts
       * 01 - CreateJobTable.sql
       * 02 - InsertJobTypes.sql
       * 03 - AndSoOn.sql

I had come across MSDeploy a few times as a way to deploy scripts to a server, using dbFullSql as detailed here: http://blogs.iis.net/msdeploy/archive/2008/11/10/the-msdeploy-sql-database-provider-dbfullsql.aspx. I liked the sound of this, so my next step was to work out and fetch the SQL scripts required for an update. My knowledge of what TeamCity can and cannot do is still limited. I thought getting TeamCity to grab SQL files from an SVN check-in would be trivial, but after a bit of investigating… I began to wonder if that was in fact the safest way to go about it. My main concern was “What if someone checks in scripts that don’t work?”. For this reason I decided the best thing to do was to create my own giant update script file, complete with a transaction so that it could be rolled back if anything went wrong. I briefly considered getting this working with PowerShell, but then said “Hey! I’m a developer, why don’t I make my own app?!”.

Console App Version 1 – Mega-script maker

So instead of putting TeamCity in charge of getting the SQL files, I’ll take over and do it myself. The first thing I’ll need to be able to do is get files from SVN. After a quick Google, SharpSvn becomes the answer (http://sharpsvn.open.collab.net/). It seems to be the API which AnkhSVN 2.X uses to integrate Visual Studio with SVN, so I have good confidence in it. After a quick bit of fiddling around, I am able to make requests to my SVN server, pull down revisions within a certain range, and stream them into strings to construct my new mega-script!

using (var client = new SvnClient())
{
    var targetScriptsUri = new Uri(arguments.TargetRootPath + arguments.TargetScriptsPath);
    var revisionScriptChanges = new List<RevisionInfo>();

In order to specify SVN credentials, one needs to first clear the current credentials, then set the default credentials as needed.

    client.Authentication.Clear();
    client.Authentication.DefaultCredentials = new System.Net.NetworkCredential(arguments.SvnUsername, arguments.SvnPassword);

    client.Log(targetScriptsUri, new SvnLogArgs { Range = new SvnRevisionRange(arguments.RevisionFrom, arguments.RevisionTo) },
                (sender, eventArgs) =>
                    {
                        foreach (var changedPath in eventArgs.ChangedPaths)
                        {

We are only interested in SQL files, so I filter based on files ending with .sql. Yes, it is possible that somebody added a script under a different extension, but we have to draw the line somewhere with standards. You’ll notice I also collect items for every SVN action. This is so that I can then determine whether to include each one in the update or not, and, if the same file was changed many times, which one should be run.

                            if (changedPath.Path.ToLower().EndsWith(".sql"))
                            {
                                var scriptPath = arguments.TargetRootPath + changedPath.Path;
                                var revisionInfo = new RevisionInfo
                                    {
                                        ScriptPath = scriptPath,
                                        Revision = eventArgs.Revision,
                                        ChangedItem = changedPath,
                                        Author = eventArgs.Author
                                    };
                                revisionScriptChanges.Add(revisionInfo);
                            }
                        }
                    });

Now that we have all the changed item information, I create another list: a list of the actual scripts to run. Ordering all the changes first by name, then by revision, I am able to get a logical order of execution. We loop through all the changed items in the now-correct order and, depending on their SVN action, add/remove/replace them in our list of scripts to run.

    var revisionScriptsToRun = new List<RevisionInfo>();

    revisionScriptChanges = revisionScriptChanges.OrderBy(r => r.ScriptPath).ThenBy(r => r.Revision).ToList();
    foreach (var revisionInfo in revisionScriptChanges)
    {
        switch (revisionInfo.ChangedItem.Action)
        {
            case SvnChangeAction.Modify:
            case SvnChangeAction.Replace:
                var itemToReplace =
                    revisionScriptsToRun.FirstOrDefault(
                        r => r.ChangedItem.Path == revisionInfo.ChangedItem.Path);
                if (itemToReplace != null)
                    revisionScriptsToRun.Remove(itemToReplace);
                revisionScriptsToRun.Add(revisionInfo);
                break;
            case SvnChangeAction.Delete:
                var itemToDelete =
                    revisionScriptsToRun.FirstOrDefault(
                        r => r.ChangedItem.Path == revisionInfo.ChangedItem.Path);
                if (itemToDelete != null) revisionScriptsToRun.Remove(itemToDelete);
                break;
            case SvnChangeAction.None:
                // Do Nothing
                break;
            case SvnChangeAction.Add:
                revisionScriptsToRun.Add(revisionInfo);
                break;
            default:
                throw new ArgumentOutOfRangeException();
        }
    }

We finally know what files we need and what order to run them in; all that’s left is to read the contents of the files into memory and then into strings so that we can go ahead and process them.

    var fullScriptInfoList = new List<FullScriptInfo>();
    foreach (var scriptToRun in revisionScriptsToRun)
    {
        var svnFileTarget = new SvnUriTarget(new Uri(scriptToRun.ScriptPath), scriptToRun.Revision);

        using (var stream = new MemoryStream())
        {
            client.Write(svnFileTarget, stream);
            var streamReader = new StreamReader(stream, true);
            stream.Position = 0;
            var fileContent = streamReader.ReadToEnd();

            fullScriptInfoList.Add(new FullScriptInfo
            {
                SvnAction = scriptToRun.ChangedItem.Action.ToString(),
                SvnAuthor = scriptToRun.Author,
                SvnPath = scriptToRun.ChangedItem.Path,
                SvnRevision = scriptToRun.Revision,
                Script = fileContent
            });
        }
    }
    result.FullScriptInfoList = fullScriptInfoList;
}
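
For reference, the RevisionInfo and FullScriptInfo classes used above are just simple containers. Here is a minimal sketch of what they might look like; the property types are inferred from how they’re used, so treat the exact shapes as my assumption (SvnChangeItem comes from SharpSvn):

public class RevisionInfo
{
    public string ScriptPath { get; set; }          // full SVN URI of the script file
    public long Revision { get; set; }              // revision the change was committed in
    public SvnChangeItem ChangedItem { get; set; }  // SharpSvn change entry (path + action)
    public string Author { get; set; }              // who committed the change
}

public class FullScriptInfo
{
    public string SvnAction { get; set; }           // Add / Modify / Replace etc. as a string
    public string SvnAuthor { get; set; }
    public string SvnPath { get; set; }
    public long SvnRevision { get; set; }
    public string Script { get; set; }              // the actual SQL file contents
}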

I spent quite a bit of time trying to ensure this was a foolproof solution, so the script file is constructed within a TRY block, with a transaction and a rollback if any errors are caught, as follows:

BEGIN TRY
   BEGIN TRANSACTION
   -- write each script file pulled from svn's contents
   COMMIT TRANSACTION
END TRY
BEGIN CATCH
   IF @@TRANCOUNT > 0
      ROLLBACK TRAN

   DECLARE @ErrorMessage NVARCHAR(4000) = ERROR_MESSAGE()
   DECLARE @ErrorSeverity INT = ERROR_SEVERITY()
   RAISERROR (@ErrorMessage, @ErrorSeverity, 1)
END CATCH

--BUT IT DOESN'T WORK WITH GOs!!

The console app was looking perfect. It would take in arguments for SvnRootPath, SvnSqlScriptsPath, SvnUsername, SvnPassword, SvnRevisionFrom, SvnRevisionTo and OutputFilePath, and was able to query SVN for the revisions requested and spit out the mega-script, complete with nice SQL comments throughout to make reading through it very easy.
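
Assembling the mega-script itself was straightforward string building, roughly along these lines (a sketch only; the comment header format is my own illustration, not the app’s exact output):

var megaScript = new StringBuilder();
megaScript.AppendLine("BEGIN TRY");
megaScript.AppendLine("   BEGIN TRANSACTION");

foreach (var scriptInfo in result.FullScriptInfoList)
{
    // a readable comment header in front of every script pulled from SVN
    megaScript.AppendLine();
    megaScript.AppendLine("   -- -----------------------------------------------------------");
    megaScript.AppendLine(string.Format("   -- {0} (r{1}) - {2} by {3}",
        scriptInfo.SvnPath, scriptInfo.SvnRevision, scriptInfo.SvnAction, scriptInfo.SvnAuthor));
    megaScript.AppendLine("   -- -----------------------------------------------------------");
    megaScript.AppendLine(scriptInfo.Script);
}

megaScript.AppendLine("   COMMIT TRANSACTION");
megaScript.AppendLine("END TRY");
megaScript.AppendLine("BEGIN CATCH");
megaScript.AppendLine("   IF @@TRANCOUNT > 0");
megaScript.AppendLine("      ROLLBACK TRAN");
megaScript.AppendLine("   DECLARE @ErrorMessage NVARCHAR(4000) = ERROR_MESSAGE()");
megaScript.AppendLine("   DECLARE @ErrorSeverity INT = ERROR_SEVERITY()");
megaScript.AppendLine("   RAISERROR (@ErrorMessage, @ErrorSeverity, 1)");
megaScript.AppendLine("END CATCH");

File.WriteAllText(arguments.OutputFilePath, megaScript.ToString());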

Now the problems. The mega-script looked great, but SQL Server Management Studio failed to parse it. To me, everything looked good. What I didn’t realize is that GO messes everything up. GO is SQL Server Management Studio’s batch separator, and most if not all of our scripts contain GOs throughout. This is a problem because my TRY block then gets broken up: the GOs inside the TRY block mark the end of each batch, and without the end of the TRY, the script isn’t valid! This is a big problem for me, as I want this to be a super database update process which rolls back EVERYTHING if an error occurs anywhere. Stripping out the GOs in .NET before writing the scripts to the mega-script file doesn’t strike me as safe, so I look for other alternatives. Tossing the TRY CATCH blocks, the script now parses. I turn to transactions and nested transactions to attempt to safely run script after script and roll back on errors. It parses, but GO gets in the way here too. Ultimately, it doesn’t work as intended and allows a mixture of things getting run or rolled back, leaving the database in a Frankenstein of a state, not knowing what version it really is. Time for another U-turn.

Console App Version 2 – Direct Execution

At this point I decide to give up on the idea of creating a SQL script file and using MSDeploy to run it (without ever getting as far as seeing whether that was even possible in the first place). I had now decided to, yet again, “do it myself”. The new direction for the console app is to execute the scripts directly against the target database. It’s been a while since I’ve used ADO.NET directly through SqlCommands and the like (we’re very spoilt these days with EF), but I head down the old path nonetheless, knowing that it too can run transactions.

using (var connection = new SqlConnection(request.ConnectionString))
{
    connection.Open();
    var command = connection.CreateCommand();

    // Start a local transaction
    var transaction = connection.BeginTransaction("Transaction");

    // Must assign both transaction object and connection to Command object for a pending local transaction
    command.Connection = connection;
    command.Transaction = transaction;

    try
    {
        foreach (var revisionScript in request.FullScriptInfoList)
        {

            // Find Current Log Record
            var logRecord = result.ScriptResults.Single(r => r.ScriptName == revisionScript.SvnPath);

GO is again a problem, as it is not natively a SQL command (http://stackoverflow.com/questions/40814/how-do-i-execute-a-large-sql-script-with-go-commands-from-c), so I write a SQL batch parser which builds command text line by line and executes it when it reaches a GO. GO can only appear on its own line, so it’s quite easy to identify.

            // Parse scripts line by line into SQL batches. Execute each batch when a GO statement is encountered.
            var sqlBatchBuilder = new StringBuilder();
            var sqlScriptLines = revisionScript.Script.Split(new[] { '\n', '\r' });

            try
            {
                foreach (var sqlScriptLine in sqlScriptLines)
                {
                    if (sqlScriptLine.ToUpperInvariant().Trim() != "GO")
                    {
                        sqlBatchBuilder.AppendLine(sqlScriptLine);
                    }
                    else
                    {
                        command.CommandText = sqlBatchBuilder.ToString();
                        if(!string.IsNullOrWhiteSpace(command.CommandText)) command.ExecuteNonQuery();
                        sqlBatchBuilder.Length = 0;
                    }
                }

                // Execute any remaining scripts
                command.CommandText = sqlBatchBuilder.ToString();
                if (!string.IsNullOrWhiteSpace(command.CommandText)) command.ExecuteNonQuery();
            }
            catch
            {
                logRecord.ResultType = ResultType.Fail;
                logRecord.ResultMessage = "Script execution failed";
                throw;
            }

            logRecord.ResultType = ResultType.Success;
            logRecord.ResultMessage = "Script executed successfully";
        }

        // After all scripts have been batched and executed, attempt to commit the transaction
        transaction.Commit();
        result.ResultType = ResultType.Success;
        result.ResultMessage = "All scripts executed successfully";

        return result;
    }
    catch (Exception commitException)
    {
        // Attempt to roll back the transaction
        try
        {
            transaction.Rollback();

            result.ResultType = ResultType.Fail;
            result.ResultMessage = "Script execution failed and transaction rollback successful";

            return result;
        }
        catch (Exception rollbackException)
        {
            // This catch block will handle any errors that may have occurred
            // on the server that would cause the rollback to fails, such as
            // a closed connection
            result.ResultType = ResultType.CriticalFail;
            result.ResultMessage = "Script execution failed and transaction rollback failed!";
            return result;
        }
    }
}

I end up with a nice solution which loops through all my scripts and, for each script, batch-processes its parts one by one. The loop sits inside a try block, and if any errors do occur during the process, I catch them and roll the transaction back with ADO.NET. Ready to rock and roll… almost.

Console App Version 3 – Keeping track of your version

The app is now able to pull scripts from SVN, execute them directly against the SQL Server, and roll back the changes if any errors occur. It’s still not foolproof though, as I could run the same update multiple times and possibly run into problems there. For this reason I add a couple of tables to the target database which are in charge of versioning.

[Screenshot: dbversion – the versioning tables]

Every time an update is run, DatabaseUpdateVersionLog will have a record added specifying the arguments that were passed to it. It also, importantly, saves the result of the update: Success for success, Fail for failure and rollback, CriticalFail for failure and rollback failure (i.e. something bad happened!!!).

For every script in an update, a record is added to DatabaseUpdateScriptLog, which contains more specific information about the script file and, more importantly, its result status: Success for success, Fail for failure, and Skip for when the update didn’t even get that far before failing.
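
As a rough idea of the shape of those log tables, the entities look something like this (hand-written here for illustration; in the real thing they’d be generated from the EF model, and any column beyond the ones mentioned above is my assumption):

public class DatabaseUpdateVersionLog
{
    public int Id { get; set; }
    public DateTime RunDate { get; set; }             // when the update was run (assumed)
    public string Arguments { get; set; }             // the arguments the update was run with (assumed)
    public long SVNRevisionFrom { get; set; }
    public long SVNRevisionTo { get; set; }           // used later to work out the current database version
    public string ResultType { get; set; }            // Success / Fail / CriticalFail
    public string ResultMessage { get; set; }
}

public class DatabaseUpdateScriptLog
{
    public int Id { get; set; }
    public int DatabaseUpdateVersionLogId { get; set; }  // links the script back to its update run (assumed)
    public string ScriptName { get; set; }
    public long SVNRevision { get; set; }
    public string SVNAuthor { get; set; }
    public string ResultType { get; set; }               // Success / Fail / Skip
    public string ResultMessage { get; set; }
}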

For fun (I do like to torment myself), I decided to do this part of the database handling with EF, so I created my own EF model of a database containing these three tables, with the intent of using the same connection string passed in through the arguments for the entity connection. Of course it is not so simple, as it turns out Entity Framework’s connection string is different to a standard connection string and contains extra information such as the metadata and provider. To get around this, I use an EntityConnectionStringBuilder to build an entity connection string and then update its provider connection string with our target database’s:

private static string ConvertToEntityConnectionString(string databaseConnectionString)
{
   var connectionStringSettings = ConfigurationManager.ConnectionStrings["DatabaseEntities"];

   var entityBuilder = new EntityConnectionStringBuilder();
   entityBuilder.Provider = connectionStringSettings.ProviderName;
   entityBuilder.ConnectionString = connectionStringSettings.ConnectionString;
   entityBuilder.ProviderConnectionString = databaseConnectionString;

   return entityBuilder.ToString();
}

I’m now also able to keep track of what version the database has been upgraded to by getting the max DatabaseUpdateVersionLog.SVNRevisionTo where ResultType equals Success. I use this to validate whether a requested update is valid. And of course, it is a nice verbose log of events from the update if you ever want to investigate what has been going on. I’m very happy with this now, and it can now be integrated with our TeamCity auto deployments, but what about when we want to deploy to our UAT or Live sites which, for security reasons, sit on a different network altogether…
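
In code, that version check boils down to a simple LINQ query over the log table. A minimal sketch, assuming a generated DatabaseEntities context and entity/property names along the lines sketched above (the revision range is assumed to be carried on the request object):

using (var context = new DatabaseEntities(ConvertToEntityConnectionString(request.ConnectionString)))
{
    // the highest revision the database has successfully been updated to (0 if never updated)
    var currentRevision = context.DatabaseUpdateVersionLogs
        .Where(l => l.ResultType == "Success")
        .Max(l => (long?)l.SVNRevisionTo) ?? 0;

    if (request.RevisionFrom <= currentRevision)
    {
        // the requested range overlaps scripts that have already been applied - reject the update
    }
}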

Console App Version 4 – Offline deployment

Well, it’s already doing everything it needs to do; it just needs to be able to do the steps at separate times. So I introduce a couple of different modes of operation to the console app. Direct is the mode already implemented; Export and Import are the new modes.

  • Export is used to get scripts from SVN within a revision range. It then saves them to an XML file. It does not care about the target db at all.
  • Import is used to import an XML Data file and apply updates within it to a target database. It cares about the version compatibility but has nothing to do with SVN any more.

Implementation is simple enough. We reuse the method used to query SVN and build the same list of scripts to run for a particular revision range. I then use .NET 4’s XDocument to create an XML document containing the request information as well as each script’s contents and SVN details. This is saved to an XML file for later use. Import mode likewise reads this XML document, transforming the content back into the same request object I use to run the updates, and then runs them. Voilà!
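
A sketch of roughly what the Export side produces (the element and attribute names here are my own illustration, not the app’s actual file format):

var exportDocument = new XDocument(
    new XElement("DatabaseUpdate",
        new XElement("RevisionFrom", arguments.RevisionFrom),
        new XElement("RevisionTo", arguments.RevisionTo),
        new XElement("Scripts",
            from scriptInfo in result.FullScriptInfoList
            select new XElement("Script",
                new XAttribute("path", scriptInfo.SvnPath),
                new XAttribute("revision", scriptInfo.SvnRevision),
                new XAttribute("action", scriptInfo.SvnAction),
                new XAttribute("author", scriptInfo.SvnAuthor),
                new XCData(scriptInfo.Script)))));   // CDATA keeps the raw SQL readable in the file

exportDocument.Save(arguments.OutputFilePath);

Import mode then just loads the same file with XDocument.Load, reads the Script elements back out in order, and rebuilds the request object.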

TeamCity Integration – The final annoyance

The very last thing left to do now is to integrate this with TeamCity. Easy, right? Well, yes… but not without a final annoyance. The console app is now sitting on the build server. I add a Build Step to my configuration and set the Run option to “use executable with parameters”, as that seems logical. It turns out that TeamCity has a bit of trouble coping with double quotes in the parameters section and sends the parameters incorrectly to the console app. This is a problem because the database connection string is passed in as an argument; the argument contains spaces, which would normally cause .NET to split it into separate args unless you wrap it in double quotes. Luckily, there is another Run option in TeamCity called “Custom script”, which does the job just fine.
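
The custom script for the build step then just becomes the full command line, quotes and all. Something along these lines (the executable name, argument order and connection string are made up for illustration; the %…% values are TeamCity parameter references):

DatabaseUpdater.exe Direct "Data Source=TESTSQL01;Initial Catalog=MySolution;Integrated Security=True" %SvnRootPath% %SvnSqlScriptsPath% %SvnUsername% %SvnPassword% %SvnRevisionFrom% %build.vcs.number%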

[Screenshot: teamcitystepcommand – the Custom script build step]

Now TeamCity is set up to run my Update Database build upon successful deployment. All of the helpful console output appears in TeamCity’s log, so we can always keep an eye on what’s going on, or what’s going wrong.

[Screenshot: teamcitybuilds]

[Screenshot: teamlog1]

29 scripts later… [Screenshot: teamlog2]

That was the Voyage. And here is the code!
