Friday, February 18, 2011

Sql Server Transactions - ADO.NET 2.0 - Commit and Rollback - Using Statement - IDisposable

My copy of Pro ADO.NET 2.0 by Apress showed up the other day.  I am not ready to give my final review of it, but it seems to be a solid book.  Most of ADO.NET has not changed with the release of ADO.NET 2.0, so when you buy a new book about ADO.NET 2.0, don't expect a wealth of new information.
Anywhoo, I thought I would share some basic information about SQL Server Transactions for those who are new to ADO.NET.  The book does a good job of providing an overview of transactions and a basic template for how to execute local transactions in code.  Here is the basic template for an ADO.NET 2.0 Transaction:
using (SqlConnection connection =
            new SqlConnection(connectionString))
{
    SqlCommand command = connection.CreateCommand();
    SqlTransaction transaction = null;
    
    try
    {
        // BeginTransaction() Requires Open Connection
        connection.Open();
        
        transaction = connection.BeginTransaction();
        
        // Assign Transaction to Command
        command.Transaction = transaction;
        
        // Execute 1st Command
        command.CommandText = "Insert ...";
        command.ExecuteNonQuery();
        
        // Execute 2nd Command
        command.CommandText = "Update...";
        command.ExecuteNonQuery();
        
        transaction.Commit();
    }
    catch
    {
        // Rollback only if the transaction was actually started;
        // Open() or BeginTransaction() may have thrown first
        if (transaction != null)
            transaction.Rollback();
        throw;
    }
    finally
    {
        connection.Close();
    }
}
A C# using statement wraps the connection, because SqlConnection implements IDisposable.  The using statement makes sure that Dispose() gets called on the connection object so it can free up any unmanaged resources.
Before you can begin a transaction, you must first open the connection.  You then begin the transaction, assign any newly created command objects to it, and execute queries as necessary.  Finally, commit the transaction.  If an error occurs, roll back the transaction in the catch block to void out any changes ( guarding against a null transaction in case the connection never opened ) and then rethrow the error so the application can deal with it accordingly.  The connection is properly closed in the finally block, which runs no matter what, and any unmanaged resources are freed when the using statement calls Dispose() on the connection.  A pretty simple solution to a fairly advanced topic.
The above template could actually wrap a second C# using statement around the command, because SqlCommand also implements IDisposable.  I don't know that it is strictly necessary, but I just like to see using statements around anything that implements IDisposable:
using (SqlConnection connection =
            new SqlConnection(connectionString))
{
    using (SqlCommand command =
            connection.CreateCommand())
    {
        SqlTransaction transaction = null;
        
        try
        {
            // BeginTransaction() Requires Open Connection
            connection.Open();
            
            transaction = connection.BeginTransaction();
            
            // Assign Transaction to Command
            command.Transaction = transaction;
            
            // Execute 1st Command
            command.CommandText = "Insert ...";
            command.ExecuteNonQuery();
            
            // Execute 2nd Command
            command.CommandText = "Update...";
            command.ExecuteNonQuery();
            
            transaction.Commit();
        }
        catch
        {
            // Rollback only if the transaction was actually started;
            // Open() or BeginTransaction() may have thrown first
            if (transaction != null)
                transaction.Rollback();
            throw;
        }
        finally
        {
            connection.Close();
        }
    }
}

For more information on the C# using statement and IDisposable, see

C# Code Review - C# Using Statement - Try / Finally - IDisposable - Dispose() - SqlConnection - SqlCommand

While reviewing some C# code written by a newbie programmer and buddy of mine, I noticed that Dispose() was never called on SqlConnection and SqlCommand objects.  And in all cases, the database code was not wrapped in try / finally blocks.  This is the typical newbie style of development that everyone, including myself, used in the very beginning.
SqlConnection cn = new SqlConnection(connectionString);
SqlCommand cm = new SqlCommand(commandString, cn);
cn.Open();
cm.ExecuteNonQuery();
cn.Close();


It sure is easy to follow :)
The problem is that SqlConnection and SqlCommand implement IDisposable, which means they could have unmanaged resources to clean up, and it is our job as developers to make sure Dispose() gets called on these objects after we are finished with them.  And, because an exception could be raised if the database is unavailable ( a very real possibility on WebHost4Life :) ), we need to make sure Dispose() gets called even in the case of an exception.
Personally, I like the “using” keyword in C#.  Internally, this bad boy generates a try / finally around the object being allocated and calls Dispose() for you.  It saves you the hassle of manually creating the try / finally block and calling Dispose().
The new code would look something like this:
using (SqlConnection cn = new SqlConnection(connectionString))
{
    using (SqlCommand cm = new SqlCommand(commandString, cn))
    {
        cn.Open();
        cm.ExecuteNonQuery();
    }
}

This is essentially equivalent to the following.  Strictly speaking, the C# compiler generates two nested try / finally blocks (one for the SqlConnection and one for the SqlCommand), but you get the idea:

SqlConnection cn =  null;
SqlCommand cm = null;

try
{
    cn = new SqlConnection(connectionString);
    cm = new SqlCommand(commandString, cn);
    cn.Open();
    cm.ExecuteNonQuery();
}
finally
{
    if (null != cm)
        cm.Dispose();
    if (null != cn)
        cn.Dispose();
}

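If you want to see it spelled out, the nested version the compiler actually produces looks roughly like this ( a sketch of the expansion, not the exact generated code ):
SqlConnection cn = new SqlConnection(connectionString);
try
{
    SqlCommand cm = new SqlCommand(commandString, cn);
    try
    {
        cn.Open();
        cm.ExecuteNonQuery();
    }
    finally
    {
        // Inner finally disposes the command
        if (null != cm)
            cm.Dispose();
    }
}
finally
{
    // Outer finally disposes the connection
    if (null != cn)
        cn.Dispose();
}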
You may notice that Close() is never called on the SqlConnection, cn.  Internally, Dispose() checks the status of the connection and closes it for you, so technically you don't need to call Close() on the connection yourself.  As far as I know, there is no penalty for calling Close() directly and then Dispose(), but I haven't measured it.
In addition, Dispose() clears the connection string of the SqlConnection object.  Therefore, if you want to re-open the connection after calling Dispose() on cn, you have to re-establish the connection string first; not doing so will throw an exception.
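Here is a quick hypothetical sketch of that behavior:
SqlConnection cn = new SqlConnection(connectionString);
cn.Open();
// ... do some work ...
cn.Dispose();                             // also clears cn.ConnectionString

// cn.Open();                             // would throw - no connection string
cn.ConnectionString = connectionString;   // re-establish it first
cn.Open();                                // now the connection opens fine
cn.Dispose();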

For more information on handling concurrency in multi-user ASP.NET applications, see

SQL Server - Optimistic Concurrency Database Updating - Pessimistic Concurrency - High Performance ASP.NET Websites

I left out my beloved friend, Mr. Optimistic Concurrency, in my series of articles on High Performance ASP.NET Websites Made Easy.

The Problem

Most ASP.NET applications incorporate a database as a repository for data. At some point in the life of your application, a user of that application will need to modify a record in that database. As soon as the user selects a record from the database to be modified, the selected data is old and suspect, because it is now disconnected from the database and may no longer accurately represent the record in the database.  The original record may be modified by someone else before the user is able to make and save his/her changes.
So now we have the age-old problem: how confident do we feel as developers that the data selected by one user will not be changed by another user before he/she has a chance to modify and save it back to the database?  If we judge poorly, one user ends up overwriting another user's changes, or we cause excessive delays in the performance of a multi-user ASP.NET application due to record locking.

Pessimistic and Optimistic Concurrency

In a multiuser environment, there are two models for updating data in a database: optimistic concurrency and pessimistic concurrency. Pessimistic concurrency involves locking the data at the database when you read it.  You essentially lock the database record and don't allow anyone to touch it until you are done modifying and saving it back to the database.  Here you have 100% assurance that nobody will modify the record while you have it checked out.  Another person will have to wait until you have made your changes.
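In SQL Server terms, that usually means reading the row inside a transaction with an update lock.  A rough sketch of the idea ( the Product table, connectionString, and id are placeholders, and this is just one way to do it ):
using (SqlConnection cn = new SqlConnection(connectionString))
{
    cn.Open();
    SqlTransaction tx = cn.BeginTransaction();
    try
    {
        SqlCommand cm = cn.CreateCommand();
        cm.Transaction = tx;

        // UpdLock holds an update lock on the row until the
        // transaction ends, so no one else can change it
        cm.CommandText =
            "Select Name From Product With (UpdLock) Where ID = @ID";
        cm.Parameters.AddWithValue("@ID", id);
        string name = (string)cm.ExecuteScalar();

        // ... modify the data while the row stays locked ...

        cm.CommandText = "Update Product Set Name = @Name Where ID = @ID";
        cm.Parameters.AddWithValue("@Name", name + " (modified)");
        cm.ExecuteNonQuery();

        tx.Commit();
    }
    catch
    {
        tx.Rollback();
        throw;
    }
}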
Optimistic concurrency means you read the database record but don't lock it.  Anyone can read and modify the record at any time, and you take your chances that the record is not modified by someone else before you have a chance to modify and save it.  As a developer, the burden is on you to check for changes to the original data ( collisions ) and act accordingly based on any errors that may occur during the update.
Depending on the application and the number of users, pessimistic concurrency can cause delays in your application.  Other users may have to wait while another user holds a lock on a database record.  Worst case, you get an actual deadlock, because there is a cyclical dependency on database resources and nobody can complete his/her intended task.  A timeout occurs in the application and nobody is happy, including the guy responsible for cutting you a check.
With optimistic concurrency, the application has to check for changes to the original record to avoid overwriting changes.  There is no guarantee that the original record has not been changed, because no lock has been placed on that data at its source - the database.  Hence, there is the real possibility of losing changes made by another person.

Simplest Thing Possible, But No Simpler

The key is to know your application and to pick a solution that best meets its needs.
If your application is a portal for your INETA .NET Developer Group Website, I dare say there is absolutely no chance that two people will be updating content at the same time. ;)  You are lucky if you get one person to update the website.  Hence, I would say this is very much a single-user environment and the heck with concurrency issues - they won't happen.  The chance of overwriting someone else's data is slim to none, and the repercussions are minor if it happens.
If your application is a banking application and the repercussions of overwriting data are severe, concurrency issues will be a big concern and need to be judged accordingly.  Although the chance of two users / processes modifying the same data may be slim to none in the application, the risk of it happening even once may be important enough to use pessimistic concurrency - and the heck with performance on the off chance that two processes want to access the same data.

Optimistic Concurrency Strategies

If you are in a performance state-of-mind, chances are you will go with optimistic concurrency.  Optimistic concurrency frees up database resources as quickly as possible so that other users and processes can act upon that data as soon as possible.
To the best of my knowledge, there are four popular strategies to dealing with optimistic concurrency:
  1. Do Nothing.
  2. Check for changes to all fields during update.
  3. Check for changes to modified fields during update.
  4. Check for changes to timestamp ( rowversion ) during update.
All of these strategies come down to shaping the UPDATE T-SQL command sent to the database when updating the data.  The examples below are intentionally light on detail and assume a basic understanding of ADO.NET.  They show the strategies from a 30,000-foot view.

Optimistic Concurrency on Update Strategy #1 - Do Nothing

The simplest strategy for dealing with concurrency issues during the updating of data is to do nothing.
The update command will not check for any changes in the data; it only specifies the primary key of the record to be changed.  If someone else changed the data, those changes will more than likely be overwritten:
Update Product
    Set
        Name = @Name
    Where
        ID = @ID
One would hope this means either 1) the application is a single-user application, or 2) multi-user update collisions are very unlikely and the repercussions of overwriting data are negligible.

Optimistic Concurrency on Update Strategy #2 - Check All Fields

With this strategy, the update command checks that all fields in the row ( usually minus BLOB fields ) still equal their original values when performing the update, to assure no changes have been made to the original record.  Checking the return value of ExecuteNonQuery will tell you if the update actually took place, because the return value is the number of rows affected by the query.
Update Product
    Set
        Name = @Name
    Where
        ID = @ID
      AND
        Name = @OriginalName
      AND
        Price = @OriginalPrice
This is essentially the command a CommandBuilder creates when using DataSets, and it is the strategy to use when you don't want to see any changes to the data.
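In ADO.NET, acting on that return value might look something like this ( a sketch; cn is an open SqlConnection, and newName, id, and the original values are placeholders captured when the record was first read ):
string sql =
    "Update Product Set Name = @Name " +
    "Where ID = @ID And Name = @OriginalName And Price = @OriginalPrice";

using (SqlCommand cm = new SqlCommand(sql, cn))
{
    cm.Parameters.AddWithValue("@Name", newName);
    cm.Parameters.AddWithValue("@ID", id);
    cm.Parameters.AddWithValue("@OriginalName", originalName);
    cm.Parameters.AddWithValue("@OriginalPrice", originalPrice);

    // Zero rows affected means the original record changed
    // ( or was deleted ) before our update - a collision
    if (cm.ExecuteNonQuery() == 0)
        throw new DBConcurrencyException(
            "Product was modified or deleted by another user.");
}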

Optimistic Concurrency on Update Strategy #3 - Check Only Changed Fields

Rather than checking all fields in the row to make sure they match their original value, this strategy checks only those fields that are being updated in the command.
Update Product
    Set
        Name = @Name
    Where
        ID = @ID
      AND
        Name = @OriginalName
This strategy only cares that it is not overwriting any data; it doesn't care that other fields in the record may have been changed. This could create an interesting combination of data in the row.

Optimistic Concurrency on Update Strategy #4 - Implement Timestamp

SQL Server has a timestamp ( alias rowversion ) column type that is updated every time a change is made to a record containing such a field.  Therefore, if you add such a field to a table, you only have to verify that the timestamp column still contains its original value to be assured that none of the fields in the record have changed.
Update Product
    Set
        Name = @Name
    Where
        ID = @ID
      AND
        TimestampID = @TimestampID
This is the same as Strategy #2 above without the need for checking all fields.
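In code, you carry the timestamp value ( a byte array ) along with the record when you read it and hand it back on the update.  A sketch, again assuming an open SqlConnection cn and placeholder variables:
string sql =
    "Update Product Set Name = @Name " +
    "Where ID = @ID And TimestampID = @TimestampID";

using (SqlCommand cm = new SqlCommand(sql, cn))
{
    cm.Parameters.AddWithValue("@Name", newName);
    cm.Parameters.AddWithValue("@ID", id);

    // originalTimestamp is the byte[] captured when the record was read
    cm.Parameters.Add("@TimestampID", SqlDbType.Timestamp).Value =
        originalTimestamp;

    // Zero rows affected means the timestamp changed -
    // somebody beat us to the update
    if (cm.ExecuteNonQuery() == 0)
        throw new DBConcurrencyException(
            "Product was modified by another user.");
}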

Conclusion

Optimistic concurrency has a performance component to it that suggests a higher performing ASP.NET website, so I included it in my series of posts called High Performance ASP.NET Websites Made Easy.  Yeah, I could be pushing that statement a bit ;)
There are other methods of achieving optimistic concurrency, but I think the ones above are the most popular.  A developer needs to look at the application itself to determine which strategy makes sense.  The DataSet, CommandBuilder, and DataAdapter typically handle this stuff for you using Strategy #2.  However, if you work with objects instead of DataSets, you need to handle concurrency issues yourself.  If you use an O/R Mapper, make sure you know how it handles the possibility of collisions during updates.