Reading structured files into SQL Server Part 2

My last post presented how you can read a file in a structured format into memory for further processing.

This post focuses on how you can easily transport the contents you just imported into SQL Server.

If you want to load data in bulk into SQL Server, the most efficient way of doing that is to use the class System.Data.SqlClient.SqlBulkCopy.

There are two ways you can use SqlBulkCopy. Either you give it a DataTable instance with the data represented in the same format and order as the table in the database, or you give it an IDataReader instance that provides access to the data in the same format as the DataTable would.

Both methods work just fine, but if you want high performance and efficiency you should not use a DataTable: you have to build up the DataTable object and transform all your data into its row format first, which is inefficient. The most efficient way is to implement an IDataReader on top of the data you want to import. Naturally, if you had to implement the IDataReader yourself, the DataTable approach would probably be faster to get working, since it is very easy to understand and most people have used a DataTable before. But let's say you want to insert 1 billion rows. A single DataTable cannot realistically hold 1 billion rows, so you would have to create several DataTable instances holding chunks of the data, which would still use a lot of memory and create a lot of objects for the garbage collector to collect.

By using an IDataReader you only have to provide one row at a time to the SqlBulkCopy class, and you can easily re-use your internal row representation for every row. This makes it very efficient, since you create fewer objects and hold less data in memory at the same time. Furthermore, creating fewer objects causes less garbage collection, which is good, since a collection can pause the application while it runs.
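To make the "one row at a time, one reused row buffer" idea concrete, here is a minimal sketch of an IDataReader over an in-memory sequence of lines. It is not the FileDataReader from the attached project: the class name LineDataReader and its constructor parameters are made up for illustration, every field is treated as text, and I have not run it against SqlBulkCopy.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

// Hypothetical sketch, not the FileDataReader from the attached project.
public sealed class LineDataReader : IDataReader
{
    private readonly IEnumerator<string> lines;
    private readonly string[] names;
    private readonly char separator;
    private readonly object[] row;   // one buffer, overwritten by every Read()
    private bool closed;

    public LineDataReader(IEnumerable<string> source, string[] columnNames, char fieldSeparator)
    {
        lines = source.GetEnumerator();
        names = columnNames;
        separator = fieldSeparator;
        row = new object[columnNames.Length];
    }

    // Advances to the next line and splits it into the shared row buffer.
    public bool Read()
    {
        if (closed || !lines.MoveNext()) return false;
        string[] parts = lines.Current.Split(separator);
        for (int i = 0; i < row.Length; i++)
            row[i] = i < parts.Length ? (object)parts[i] : DBNull.Value;
        return true;
    }

    // Members a consumer needs to walk the current row.
    public int FieldCount => names.Length;
    public object this[int i] => row[i];
    public object this[string name] => row[GetOrdinal(name)];
    public object GetValue(int i) => row[i];
    public bool IsDBNull(int i) => row[i] is DBNull;
    public string GetName(int i) => names[i];
    public int GetOrdinal(string name)
    {
        int index = Array.IndexOf(names, name);
        if (index < 0) throw new IndexOutOfRangeException(name);
        return index;
    }
    public int GetValues(object[] values)
    {
        int count = Math.Min(values.Length, row.Length);
        Array.Copy(row, values, count);
        return count;
    }

    // Every field is text in this sketch; the typed getters convert on demand.
    public Type GetFieldType(int i) => typeof(string);
    public string GetDataTypeName(int i) => typeof(string).Name;
    public string GetString(int i) => (string)row[i];
    public bool GetBoolean(int i) => Convert.ToBoolean(row[i], CultureInfo.InvariantCulture);
    public byte GetByte(int i) => Convert.ToByte(row[i], CultureInfo.InvariantCulture);
    public char GetChar(int i) => Convert.ToChar(row[i], CultureInfo.InvariantCulture);
    public short GetInt16(int i) => Convert.ToInt16(row[i], CultureInfo.InvariantCulture);
    public int GetInt32(int i) => Convert.ToInt32(row[i], CultureInfo.InvariantCulture);
    public long GetInt64(int i) => Convert.ToInt64(row[i], CultureInfo.InvariantCulture);
    public float GetFloat(int i) => Convert.ToSingle(row[i], CultureInfo.InvariantCulture);
    public double GetDouble(int i) => Convert.ToDouble(row[i], CultureInfo.InvariantCulture);
    public decimal GetDecimal(int i) => Convert.ToDecimal(row[i], CultureInfo.InvariantCulture);
    public DateTime GetDateTime(int i) => Convert.ToDateTime(row[i], CultureInfo.InvariantCulture);
    public Guid GetGuid(int i) => Guid.Parse((string)row[i]);
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public IDataReader GetData(int i) => throw new NotSupportedException();

    // Reader bookkeeping.
    public int Depth => 0;
    public bool IsClosed => closed;
    public int RecordsAffected => -1;
    public bool NextResult() => false;
    public DataTable GetSchemaTable() => throw new NotSupportedException();
    public void Close() { closed = true; lines.Dispose(); }
    public void Dispose() => Close();
}
```

The sketch still allocates the field strings for every line; what it avoids is building a table of row objects up front. Only the line currently being read is held by the reader.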

Now, fewer words and more code. I have created a few classes that support my IDataReader implementation:

 

  • FileDataColumn - A class that is used to describe the format in the record you try to load into the IDataReader.
  • FileDataRecord - An IDataRecord implementation with the possibility to also set the values of the record, not only read data from it.
  • FileDataReader - An IDataReader implementation that uses the FileRecordReader from my last post to provide forward only access to each record as an IDataRecord.

 

 

The FileDataColumn class contains only two properties, ColumnName and ColumnType. It is fairly obvious what they are used for, so I will not go into any detail on that class.
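For readers without the attached project at hand, a class with those two properties could look like the sketch below. The real class in the zip may differ, and the FieldConverter helper is purely hypothetical; it is only there to show why the column type is useful when a raw text field has to become a typed value.

```csharp
using System;
using System.Globalization;

// Sketch only: mirrors the two properties described above.
public class FileDataColumn
{
    public string ColumnName { get; set; }
    public Type ColumnType { get; set; }
}

// Hypothetical helper, not part of the attached project.
public static class FieldConverter
{
    // Converts the raw text of a field to the type declared for its column.
    public static object ConvertField(string rawValue, FileDataColumn column)
    {
        return Convert.ChangeType(rawValue, column.ColumnType, CultureInfo.InvariantCulture);
    }
}
```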

The FileDataReader takes a few arguments in its constructor that will enable it to read the data and provide a nice interface to it.

 

/// <summary>
/// Initializes a new instance of the <see cref="FileDataReader"/> class.
/// </summary>
/// <param name="fileStream">The file stream.</param>
/// <param name="columns">The columns describing the format of the stream for a single record.</param>
/// <param name="recordSeparator">The record separator.</param>
/// <param name="fieldSeparator">The field separator.</param>
/// <param name="fileEncoding">The file encoding.</param>
/// <param name="recordManipulator">The record manipulator.</param>
public FileDataReader(Stream fileStream, 
FileDataColumn[] columns, 
char recordSeparator, 
char fieldSeparator, 
Encoding fileEncoding,
Action<FileDataRecord> recordManipulator)

 

First argument is the stream where the data is located. In real world scenarios this would be a FileStream variant that would point to the file you want to read - this filestream will be passed onto the FileRecordReader instance that the constructor creates.

Second argument is an array of FileDataColumn objects that describes the record format of the file. They must be in the same order as the fields in the file.

Third argument is the record separator character, i.e. the character that separates the records from each other in the file.

Fourth argument is the field separator character, i.e. the character that separates the fields in the file.

Fifth argument is the encoding of the file, which is important in particular if you want to read text.

The last argument is an action that is called before each call to Read returns, which gives you an opportunity to modify the data before it is passed on to whatever reads from the reader.

You use the FileDataReader as you would use any other IDataReader: by invoking the Read() method, which returns a bool indicating whether or not the reader was positioned at the next record.

i.e. 

 

IDataReader dataReader = new FileDataReader(s, cols, '\n', ',', Encoding.Unicode);

while (dataReader.Read())
{
    string fieldValue = (string)dataReader["field"];
    int fieldValue2 = (int)dataReader[2];
}

And so forth. The beauty of it is that if you do not want to do any processing, you can just give SqlBulkCopy the instance of the FileDataReader and you don't have to do any more work whatsoever.

If you need to manipulate each record, you simply provide an Action to the FileDataReader i.e.

Stream s = new MemoryStream(1000);
for (int x = 0; x < 10; x++)
{
    AddRecordToStream(s, string.Format("{0}\n", (x * 10)));
}
s.Position = 0;
FileDataColumn[] cols = new[] 
{ 
    new FileDataColumn { ColumnName = "First", ColumnType = typeof(int) } 
};

IDataReader dataReader = new FileDataReader(
    s,
    cols,
    '\n',
    ';',
    Encoding.Unicode,
    record =>
    {

        int currentValue = record.GetInt32(0);
        record.SetValue(0, currentValue * 2);
    });

for (int x = 0; x < 10; x++)
{
    dataReader.Read();
    Assert.That(dataReader[0], Is.EqualTo(x * 10 * 2), x.ToString());

}

Nice and easy if you ask me. Naturally you could easily extend and improve my FileDataReader implementation, but this will give you a hint on how you can efficiently read a file into SQL Server if you need to.

To use this reader together with SqlBulkCopy you simply create an instance of the FileDataReader and use it like below:

 

using (SqlBulkCopy bulkCopy =
                new SqlBulkCopy(destinationConnection))
{
    bulkCopy.DestinationTableName =
        "dbo.DestinationTable";

    try
    {
        bulkCopy.WriteToServer(reader);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
    finally
    {
        reader.Close();
    }
}

 

I have attached the entire source code project for both this post and the previous one, including integration tests that will show how to use the code.

I hope you enjoy using it, I certainly enjoyed writing the code.

Any questions, post a comment or leave feedback.

FileDataReader.zip (14.44 kb)


Reading structured files into SQL Server Part 1

From time to time we have all probably been tasked with getting a structured file into SQL Server.

It could be a comma-separated file, or it could use some other delimiter; it does not really matter. What matters is that there are several ways of getting that data into SQL Server: some are fast and efficient, others are slow and sometimes even impossible if you do it wrong.

The obvious way of importing a structured file into SQL Server is to use either BCP or SQL Server's built-in BULK INSERT.

i.e. 

BULK INSERT YourTableName
FROM 'c:\commaseparatedfilename.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
GO

 

BCP and the built-in method are fine when you want to do a one-time import. If you like to stretch it a bit, you could even do periodic imports using a maintenance task that imports a specific file from a location into SQL Server every day at a certain time.

But what if you need to do some processing of the file? Then you are pretty much stuck with writing a program that reads the file and writes the modified records into SQL Server.

That might be a cumbersome task, and what if your file is several gigabytes in size? Then you cannot simply read in the entire file, since your program might run out of memory. So, if possible, you should read one record from the file at a time, process it and pass it on to SQL Server.

I have created a few classes to help with that, which I will present in this blog post and the ones to come.

The tasks you need to do to get that file into SQL Server are probably something like:

 

  1. Read the records out of the file, one at a time, as efficiently as possible and using as little memory as possible.
  2. Parse each record into its different columns, resulting in a strongly typed object that can easily be passed on to SQL Server.
  3. Optionally process each record and its values before it is passed on to SQL Server for storing.

 

I will present a nice solution to task #1 in this first blog post, and will present a solution to #2 in the next blog post.

For task #1 I have created a nice little class that I call FileRecordReader, which basically has a single method called ReadNextRecord.

 

The method ReadNextRecord will read the next record and return that as a string and advance its internal positions to the location of the next record in the file.

 

/// <summary>
/// Reads the next record from the stream.
/// </summary>
/// <returns>The next record from the stream or null if no more records exist.</returns>
public string ReadNextRecord()
{

 

The FileRecordReader class takes a few arguments in its constructor that will help it read the file and understand where a record starts and stops.

 

/// <summary>
/// Initializes a new instance of the <see cref="FileRecordReader"/> class.
/// </summary>
/// <param name="fileStream">The file stream.</param>
/// <param name="recordSeparator">The record separator.</param>
/// <param name="fileEncoding">The file encoding.</param>
public FileRecordReader(Stream fileStream, char recordSeparator, Encoding fileEncoding)
{

 

First argument is the stream where the reader should read its data from, which in real life usages should be a FileStream instance.

The second argument is a char that separates the records from each other. The normal case would be a newline character \n, but the class supports any arbitrary character you would like to use, in case your records contain line breaks that you want to retain in the imported data.

The last parameter is the encoding of the file. This is also very important, since a UTF-8 or UTF-16 encoded file that is read using your standard Windows encoding will not look pretty: it will be parsed incorrectly.

If you look at the attached source code you might see that it works in a similar way to the built-in StreamReader class and its ReadLine method. But if you have a different record separator you cannot use ReadLine and have to parse the file yourself.
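As a rough illustration of the idea (not the attached FileRecordReader itself), a reader that returns one record at a time for an arbitrary separator character can be sketched like this. The class name SimpleRecordReader is made up, and the sketch leans on StreamReader for the decoding instead of handling buffers itself.

```csharp
using System.IO;
using System.Text;

// Hypothetical sketch of a record reader with a configurable separator.
public sealed class SimpleRecordReader
{
    private readonly StreamReader reader;
    private readonly char separator;
    private readonly StringBuilder buffer = new StringBuilder(256); // reused for every record

    public SimpleRecordReader(Stream stream, char recordSeparator, Encoding encoding)
    {
        reader = new StreamReader(stream, encoding);
        separator = recordSeparator;
    }

    // Returns the next record, or null when the stream has no more records.
    public string ReadNextRecord()
    {
        buffer.Length = 0;
        bool readAny = false;
        int next;
        while ((next = reader.Read()) != -1)
        {
            readAny = true;
            if ((char)next == separator)
            {
                return buffer.ToString();
            }
            buffer.Append((char)next);
        }
        // End of stream: return what is left, or null if nothing was read at all.
        return readAny ? buffer.ToString() : null;
    }
}
```

Only the current record is held in memory, so the size of the file does not matter much; the longest single record is what decides the memory use.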

I have attached the source code to the FileRecordReader class and also a simple test that tests that the class is working.

FileRecordReader.cs (5.33 kb)

FileRecordReaderTests.cs (2.86 kb)

 

Stay tuned for the next post where I will describe how you can use the above FileRecordReader to present a nice interface that makes it easy to get those records into SQL Server.

Updates to the asynchronous memcached client

This evening I updated the asynchronous memcached client that I am building.

I implemented basic performance counters, so the client will populate performance counters for the following items:

 

  • Average wait time for a free client socket
  • Total cache operations per second
  • Adds per second
  • Appends per second
  • CAS per second
  • Deletes per second
  • Prepends per second
  • Replaces per second
  • Sets per second
  • Errors per second

 

The performance counters also gave me a nice way to assess the performance of the client. With a soak test program running full throttle, sleeping only one ms between iterations and using 50 threads and 25 sockets, the code is capable of almost 20,000 add operations per second against one memcached server.

That's pretty awesome if you ask me :)

Check it out at http://asyncmemcached.codeplex.com - and please do add comments (if anyone is reading this at all).

Asynchronous memcached client

I have been working with distributed caching for about 4 years now, using memcached as the only server.

I have been trying out different memcached clients, and some have been good, others bad.

They have all had the same problem: they are implemented synchronously, i.e. they waste a lot of threads on simple waits.

I have started a project to create a fully asynchronous memcached client in .NET.

Check out:

http://asyncmemcached.codeplex.com/

It's not production code yet, but it's a fully working client for gets/sets. It just needs some additional features, and then I will release a version.

Error 0x80005000 when using Directory Services in .NET

I am currently developing a deployment tool to help me do easy deployments of websites to many web servers at the same time.

To do this I am using a combination of WMI and Directory services in .NET.

When I tried out the tool on our production environment I got some COM exceptions.

Naturally I started looking at my code, trying different approaches, but to no avail.

I later found out that to use Directory Services together with IIS, you need to have IIS installed on the machine you are running the code from.

Even if you do not manipulate the local machine.

So e.g. a directory URL called IIS://machinename/W3SVC will not work, unless you install IIS on the local machine from where you run the code.

If IIS is not installed you will get an error like:

 System.Exception: System.Runtime.InteropServices.COMException (0x80005000): Unknown error (0x80005000)
   at System.DirectoryServices.DirectoryEntry.Bind(Boolean throwIfFail)
   at System.DirectoryServices.DirectoryEntry.Bind()


The code I was using was:

Example:
string iisMetaBasePath = "IIS://Server/W3SVC";
using (DirectoryEntry dir = new DirectoryEntry(iisMetaBasePath))
            {
                foreach (DirectoryEntry de in dir.Children)
                {

 

 

Subversion repository upgrade using svnadmin dump

I have just done a subversion repository upgrade from version 1.4.2 to version 1.5.4.

It's rather simple: just install the new version of the Subversion binaries and run the following command:

svnadmin upgrade <path to repository>

The only problem with running svnadmin upgrade is that it's not guaranteed to make all new features available in the repository, and you will not benefit from any storage engine changes.

So after trying out svnadmin upgrade, I decided to do a full svnadmin dump/load.

Beware :)

svnadmin dump <path to repository> > <path to file> writes the dump to the file, but it also prints progress feedback to the console for every revision it dumps. On a Windows machine, as we use, that made the machine close to unusable for the entire duration of the dump; for some reason printing huge amounts of output to the console kills a Windows machine (our repository is 1.5 GB).

So I found the nice little switch --quiet, which you can use with dump. It suppresses the progress output, so only the dump file gets written. This makes everything run faster and your machine will not lock up, so use it unless you want pain.

The same goes for svnadmin load: use --quiet, otherwise svnadmin prints progress for everything it loads while it creates the repository, resulting in the same pain as the dump.

This was just my insight, use it or don't :)

I will for sure next time I need to upgrade another repository.

Regular Expressions in .NET are cool

We have all been faced with the problem of finding a specific part of a larger string and using that particular part for further processing.
It can be pattern matching on lots of lines, for example when you have to load a text file into an object structure, or it can be very simple: finding a particular range of text in another string.

The obvious way to find those bits of strings is to use string.IndexOf and string.Substring. Those methods on the string class are nice and very fast if it's a simple string and you know exactly how the string is formatted. But as soon as the string you are searching in gets a little more complicated, you end up writing a lot of lines of code that easily become error-prone and can be hard to extend, unless you really do things correctly from the beginning.

In some situations string.IndexOf and string.Substring are the correct choice, but I will show you in this blog post that with regular expressions you can end up with code that has far fewer lines, is easier to understand, is very easy to extend and, in some situations, is even faster.

Admittedly regular expressions have a steep learning curve, but when you get into it, you will be amazed what you can do with regular expressions.

I have created some tasks that I will solve in this blog post, I will solve them using regular expressions and also with the more traditional approach using string.IndexOf and string.Substring. The reason for doing both is to show you the difference in the lines of code, the readability and not least the performance difference of the two solutions.

The tasks are:
  • Create parser code that can parse a VCARD string into a simple object structure.
  • Create code that can extract the telephone number from a VCARD string
The first task is to create a parser that can parse a VCARD into an object. I have created the VCARD below, which will be used as an example VCARD.

BEGIN:VCARD
FN;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:Bj=C3=B8rn Bouet Smith
TEL:+4512345678
X-IRMC-URL:http://blog.smithfamily.dk
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:This is a note=
 With multiple lines=
 Of text
END:VCARD


You might not be familiar with the VCARD format, but it's nothing more than a string representation of a contact. You can read more about VCARD at this website.

The VCARD format is used all over. Most if not all mobile phones use the VCARD format when synchronizing their contacts to a server, or to Outlook for that matter.

Anyway onto the code. I have created a Contact class that will be the resulting object that the parser will create based on the string representation. The class is like the following:

/// <summary>
/// Sample object that contains 4 specific fields and a collection of unspecified fields
/// </summary>
public class Contact
{
    public Contact()
    {
        OtherFields = new List<KeyValuePair<string, string>>();
    }

    /// <summary>
    /// Gets or sets the name.
    /// </summary>
    /// <value>The name.</value>
    public string Name
    {
        get;
        set;
    }

    /// <summary>
    /// Gets or sets the telephone.
    /// </summary>
    /// <value>The telephone.</value>
    public string Telephone
    {
        get;
        set;
    }

    /// <summary>
    /// Gets or sets the note.
    /// </summary>
    /// <value>The note.</value>
    public string Note
    {
        get;
        set;
    }

    /// <summary>
    /// Gets or sets the URL.
    /// </summary>
    /// <value>The URL.</value>
    public string Url
    {
        get;
        set;
    }

    /// <summary>
    /// Gets or sets the other fields this contact contains
    /// </summary>
    /// <value>The other fields.</value>
    public List<KeyValuePair<string,string>> OtherFields
    {
        get;
        set;
    }
}

So basically all we have to do is parse that string into this simple object. Sounds pretty easy, right? :)

Well, naturally it's not rocket science, but it's not that easy when you look at the VCARD: it's not enough to split by line and parse line by line. The reason is that the value of a field can span multiple lines. So our parser has to take that into account, and naturally parse all unknown fields into the OtherFields property of the contact.

I have made an assumption in my code, that is that you have a QuotedPrintable decoder available.

I have made a few methods that both examples will use, which assumes you have the QuotedPrintable decoder available. The methods are as follows:

/// <summary>
/// Gets the field contents doing any necessary quoted printable decoding
/// </summary>
/// <param name="contents">The contents.</param>
/// <param name="charsetStr">The charset as a string</param>
/// <param name="encodingStr">The encoding as a string</param>
/// <returns>The decoded contents or the original contents if no decoding is necessary</returns>
private string GetFieldContents(string contents, string charsetStr, string encodingStr)
{
    bool mustDecode = !string.IsNullOrEmpty(encodingStr);
    bool haveCharset = !string.IsNullOrEmpty(charsetStr);

    if (mustDecode)
    {
        if (haveCharset)
        {
            return DecodeQuotedPrintable(contents, Encoding.GetEncoding(charsetStr));
        }
        else
        {
            return DecodeQuotedPrintable(contents);
        }
    }
    return contents;
}

/// <summary>
/// Decodes the quoted printable string
/// </summary>
/// <param name="contents">The contents.</param>
/// <param name="encoding">The encoding.</param>
/// <returns></returns>
public string DecodeQuotedPrintable(string contents, Encoding encoding)
{
    //Assumes that you have a method that can decode quoted printable taking encoding into account
    //There is plenty of free code available on the net, and to include one here would be out of
    //scope for this blog post
    return contents;
}

/// <summary>
/// Decodes the quoted printable string
/// </summary>
/// <param name="contents">The contents.</param>
/// <returns></returns>
public string DecodeQuotedPrintable(string contents)
{
    //Assumes that you have a method that can decode quoted printable using system default encoding
    //There is plenty of free code available on the net, and to include one here would be out of
    //scope for this blog post
    return contents;
}


Basically, these methods decode the contents if needed; otherwise they just return the string. This is where my code is missing the QuotedPrintable decoder. Including a complete QuotedPrintable decoder in this blog post would be out of scope and would move the focus from what's important :)

The first thing you need to do when using regular expressions is to create your Regex object. For this particular job I have created the following regular expression:
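If you want something to drop into the two placeholder methods while experimenting, a minimal quoted-printable decoder can be sketched as below. It is not the decoder used for the measurements in this post, the class name QuotedPrintableSketch is made up, and it assumes that everything which is not an =XX escape is plain text that can be encoded as-is.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical minimal decoder; a complete one handles more edge cases.
public static class QuotedPrintableSketch
{
    public static string Decode(string contents, Encoding encoding)
    {
        // A "=" directly followed by a line break is a soft line break: drop it.
        string unfolded = contents.Replace("=\r\n", "").Replace("=\n", "");

        var bytes = new List<byte>(unfolded.Length);
        for (int i = 0; i < unfolded.Length; i++)
        {
            char c = unfolded[i];
            bool isEscape = c == '='
                && i + 2 < unfolded.Length
                && Uri.IsHexDigit(unfolded[i + 1])
                && Uri.IsHexDigit(unfolded[i + 2]);

            if (isEscape)
            {
                // "=C3" becomes the single byte 0xC3.
                bytes.Add(Convert.ToByte(unfolded.Substring(i + 1, 2), 16));
                i += 2;
            }
            else
            {
                // Literal character: keep it, encoded with the target encoding.
                bytes.AddRange(encoding.GetBytes(new[] { c }));
            }
        }
        return encoding.GetString(bytes.ToArray());
    }
}
```

With the sample VCARD, the FN value Bj=C3=B8rn Bouet Smith decoded as UTF-8 gives Bjørn Bouet Smith, because C3 B8 is the UTF-8 byte sequence for ø.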

private static readonly Regex rStatic = new Regex(@"^(?<FIELDNAME>[\w-]{1,})
(?:(?:;?)(?:ENCODING=(?<ENC>[^:;]*)|CHARSET=(?<CHARSET>[^:;]*))){0,2}
:(?:(?<CONTENT>(?:[^\r\n]*=\r\n){1,}[^\r\n]*)|(?<CONTENT>[^\r\n]*))",
    RegexOptions.ExplicitCapture |
    RegexOptions.IgnoreCase |
    RegexOptions.IgnorePatternWhitespace |
    RegexOptions.Multiline | RegexOptions.Compiled);


I will explain each part of the regular expression, but I will not delve into all the details of how to create regular expressions, the syntax and so forth. For that you should consult the .NET documentation, and perhaps use one of the tools out there that can help you create and test regular expressions. I use a tool called Expresso from the company Ultrapico. I have used it since 2003, and it's extremely good. You can download it from this website.

The Regex object I have created consists of two parts: the regular expression and some options. The option RegexOptions.Compiled must be used with care. If you specify that option, the .NET Framework compiles the expression to code each time the Regex object is created, and that code stays in memory. This can leak memory unless you create the Regex object once, as a static class variable, which I have done in the above example.

Regex and description

^
Anchors the match at the beginning of a line.

(?<FIELDNAME>[\w-]{1,})
Captures word characters (letters, digits and underscore) and hyphens into the capture group FIELDNAME. At least one character is required.

(?:(?:;?)(?:ENCODING=(?<ENC>[^:;]*)|CHARSET=(?<CHARSET>[^:;]*))){0,2}
The whole part is wrapped in a non-capturing group, indicated by ?:. I want to allow up to two instances of the entire part, and to put a quantifier on it I have to enclose it in a group.
The first element in the group is (?:;?), another non-capturing group, which says that the character ; may be present. You specify "may be present" with ?, which is the same as {0,1}, only shorter.
The next element is a non-capturing group with two alternatives: either the text ENCODING= followed by a capturing group called ENC, which captures all characters except ; and :, or the text CHARSET= followed by a capturing group called CHARSET, which also captures all characters except ; and :.
The {0,2} at the end says the entire part may occur up to two times, or not at all.

:(?:(?<CONTENT>(?:[^\r\n]*=\r\n){1,}[^\r\n]*)|(?<CONTENT>[^\r\n]*))
The last part of the regex starts with the literal : and then contains two alternatives wrapped in a non-capturing group.
The first alternative is a capturing group called CONTENT that matches a value spanning several lines: one or more lines that end in a = sign followed by \r\n, and then all characters on the following line. There is no upper limit on the number of continued lines.
The other alternative is a capturing group also called CONTENT, which matches a value that ends on the same line, i.e. all characters up to the line break characters \r\n.
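To see the capture groups in action, here is a small self-contained sketch that runs the same pattern over a VCARD string and lists what each group captured. The class VCardRegexDemo and its Describe method are made up for this illustration, and RegexOptions.Compiled is left out to keep the sketch simple. Note that the pattern expects \r\n line breaks.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; the pattern is the one explained above.
public static class VCardRegexDemo
{
    private static readonly Regex Pattern = new Regex(
        @"^(?<FIELDNAME>[\w-]{1,})
          (?:(?:;?)(?:ENCODING=(?<ENC>[^:;]*)|CHARSET=(?<CHARSET>[^:;]*))){0,2}
          :(?:(?<CONTENT>(?:[^\r\n]*=\r\n){1,}[^\r\n]*)|(?<CONTENT>[^\r\n]*))",
        RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase |
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);

    // Returns one "name | enc | charset | content" line per matched property.
    public static List<string> Describe(string vcard)
    {
        var result = new List<string>();
        foreach (Match m in Pattern.Matches(vcard))
        {
            result.Add(string.Format("{0} | enc={1} | charset={2} | {3}",
                m.Groups["FIELDNAME"].Value,
                m.Groups["ENC"].Value,
                m.Groups["CHARSET"].Value,
                m.Groups["CONTENT"].Value));
        }
        return result;
    }
}
```

A group that did not take part in a match, such as ENC for the TEL line, simply has an empty Value, which is why the parsing code can read all four groups for every match without checking first.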

Now that the Regex is in place and explained, let me show you the code that uses the regular expression to solve the task we defined above:

/// <summary>
/// Returns a contact, parsing the VCARD string using regular expressions
/// </summary>
/// <param name="contents">The contents.</param>
/// <returns></returns>
public Contact GetContactRegex(string contents)
{
    //Create new instance of a Contact
    Contact contact = new Contact();

    //Match the contents with the regular expression
    MatchCollection matches = rStatic.Matches(contents);

    //Iterate over each match
    foreach (Match match in matches)
    {
        //Assign values from the match group we created in the regular expressions
        string fieldName = match.Groups["FIELDNAME"].Value;
        string fieldValue = match.Groups["CONTENT"].Value;
        string charSetStr = match.Groups["CHARSET"].Value;
        string encodingStr = match.Groups["ENC"].Value;

        //Assign values to the contact object from the values of the capture groups
        switch (fieldName)
        {
            case "FN":
                //name
                contact.Name = GetFieldContents(fieldValue, charSetStr, encodingStr);
                break;
            case "TEL":
                //telephone
                contact.Telephone = GetFieldContents(fieldValue, charSetStr, encodingStr);
                break;
            case "X-IRMC-URL":
                //url
                contact.Url = GetFieldContents(fieldValue, charSetStr, encodingStr);
                break;
            case "NOTE":
                contact.Note = GetFieldContents(fieldValue, charSetStr, encodingStr);
                break;
            default:
                //All other fields just add them to the other fields collection
                contact.OtherFields.Add(new KeyValuePair<string, string>(fieldName, GetFieldContents(fieldValue, charSetStr, encodingStr)));
                break;
        }
    }
    return contact;
}


See, the C# code is very easy to read, and if the format of the VCARD changes, all you have to do is change the regular expression; you don't have to change the logic of the C# code.

Let's move on to solving the same task using C# code only, i.e. no regular expressions, but using the same support methods, i.e. GetFieldContents.

I have created the following solution, which might not be perfect, but gets the job done.

I have created a class that will represent a single property in the VCARD:

/// <summary>
/// Class that represents a single property in the VCARD
/// </summary>
public class Line
{
    /// <summary>
    /// Gets or sets the name of the field.
    /// </summary>
    /// <value>The name of the field.</value>
    public string FieldName
    {
        get;
        set;
    }

    /// <summary>
    /// Gets or sets the charset.
    /// </summary>
    /// <value>The charset.</value>
    public string Charset
    {
        get;
        set;
    }

    /// <summary>
    /// Gets or sets the encoding.
    /// </summary>
    /// <value>The encoding.</value>
    public string Encoding
    {
        get;
        set;
    }

    /// <summary>
    /// Gets or sets the contents.
    /// </summary>
    /// <value>The contents.</value>
    public string Contents
    {
        get;
        set;
    }
}


And two methods that allow me to parse a single property line represented as a string into a Line object:

/// <summary>
/// Parses a string into a Line object
/// </summary>
/// <param name="lineString">The line string.</param>
/// <returns></returns>
private Line GetLine(string lineString)
{
    Line line = new Line();

    if (lineString.Contains("CHARSET="))
    {
        line.Charset = GetParameterValue(lineString, "CHARSET=");
    }
    if (lineString.Contains("ENCODING="))
    {
        line.Encoding = GetParameterValue(lineString, "ENCODING=");
    }
    int firstSeperator = lineString.IndexOfAny(new char[] { ';', ':' });

    line.FieldName = lineString.Substring(0, firstSeperator);

    int contentStart = lineString.IndexOf(":") + 1;
    line.Contents = lineString.Substring(contentStart).Trim();

    return line;
}

/// <summary>
/// Gets the parameter value
/// </summary>
/// <param name="contents">The contents.</param>
/// <param name="parameter">The parameter.</param>
/// <returns></returns>
private string GetParameterValue(string contents, string parameter)
{
    int paramStart = contents.IndexOf(parameter) + parameter.Length;
    if (paramStart == parameter.Length - 1)
    {
        //Not found
        return null;
    }
    int paramEnd = contents.IndexOfAny(new char[] { ';', ':' }, paramStart);
    if (paramEnd == -1)
    {
        //Not found, so return the rest of the string
        return contents.Substring(paramStart);
    }
    return contents.Substring(paramStart, paramEnd - paramStart);
}


The method that returns the Contact based on the VCARD string is as follows:

/// <summary>
/// Returns a contact object using no regular expressions
/// </summary>
/// <param name="contents">The contents.</param>
/// <returns></returns>
public Contact GetContactRegular(string contents)
{
    //Create new instance of a contact
    Contact contact = new Contact();

    //Split all lines into a string array
    string[] lines = contents.Split(new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);

    //Create a string builder that will hold each property of the VCARD as we build them from the lines array
    StringBuilder currentLine = new StringBuilder(100);

    //bool value to indicate whether or not the current line belongs together with the next line
    bool addNextLine = false;
    //Create collection of Line objects
    List<Line> allLines = new List<Line>();

    //Iterate over each string in the lines array, parsing them into a Line object
    foreach (string line in lines)
    {
        //Check whether or not the current line belongs together with the next one
        addNextLine = line.EndsWith("=");

        currentLine.AppendLine(line);
        if (!addNextLine)
        {
            //If line does not belong together with the next one, parse the string into a Line object
            allLines.Add(GetLine(currentLine.ToString()));
            currentLine = new StringBuilder(100);
        }
    }

    foreach (Line l in allLines)
    {
        switch (l.FieldName)
        {
            case "FN":
                //name
                contact.Name = GetFieldContents(l.Contents, l.Charset, l.Encoding);
                break;
            case "TEL":
                //telephone
                contact.Telephone = GetFieldContents(l.Contents, l.Charset, l.Encoding);
                break;
            case "X-IRMC-URL":
                //url
                contact.Url = GetFieldContents(l.Contents, l.Charset, l.Encoding);
                break;
            case "NOTE":
                contact.Note = GetFieldContents(l.Contents, l.Charset, l.Encoding);
                break;
            default:
                //All other fields just add them to the other fields collection
                contact.OtherFields.Add(new KeyValuePair<string, string>(l.FieldName, GetFieldContents(l.Contents, l.Charset, l.Encoding)));
                break;
        }
    }

    return contact;
}



Okay, first impression: that's a whole lot of code to get the same result as the method that uses regular expressions. The regular expression version is only 43 lines of code plus the regular expression itself, while the version without regular expressions is a whopping 138 lines. That's more than three times the amount of code that might contain bugs, needs proper unit testing and so on. So if you ask me, I would rather maintain the 43 lines than the 138 lines :)

Regular expressions can be faster at runtime than plain C# code, but in this example they are not. I ran the two methods 1 million times each, and the numbers are as follows:


Method                                       Milliseconds
GetContactRegular                            24625
GetContactRegex                              45546.875
GetContactRegex (no RegexOptions.Compiled)   105453.125

As you can clearly see, the traditional way of doing things is much faster in this particular example, almost twice as fast as the regular expression solution. That shouldn't make you dismiss the Regex solution, since easily maintainable and easily extensible code is sometimes worth more than speed. You can also see that RegexOptions.Compiled makes the regular expression more than twice as fast as the non-compiled version, so whenever you use regular expressions, consider RegexOptions.Compiled for the increased speed. Just remember to make the Regex static, so it is created only once and you don't leak memory.
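
If you want to reproduce this kind of measurement yourself, a minimal timing sketch could look like the one below. The pattern, the class name RegexTiming and the shortened VCARD are made up for illustration; this is not the code that produced the numbers above, and your timings will differ from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexTiming
{
    // Static, so each Regex is constructed only once for the whole application.
    public static readonly Regex Compiled =
        new Regex(@"^TEL:(?<TEL>[^\r\n]*)", RegexOptions.Multiline | RegexOptions.Compiled);

    public static readonly Regex Interpreted =
        new Regex(@"^TEL:(?<TEL>[^\r\n]*)", RegexOptions.Multiline);

    // Runs the match the given number of times and returns the elapsed milliseconds.
    public static long Time(Regex regex, string input, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.Match(input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string card = "BEGIN:VCARD\r\nTEL:+4512345678\r\nEND:VCARD\r\n";
        Console.WriteLine("Compiled:    {0} ms", Time(Compiled, card, 1000000));
        Console.WriteLine("Interpreted: {0} ms", Time(Interpreted, card, 1000000));
    }
}
```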

Okay, this task clearly shows that regular expressions can give you code that is easier to read, smaller to manage and easier to extend, but in this particular example they lose on speed.

Let's move on to the next task: returning only the telephone number from the VCARD. We will use the same VCARD for the test:

BEGIN:VCARD

FN;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:Bj=C3=B8rn Bouet Smith

TEL:+4512345678

X-IRMC-URL:http://blog.smithfamily.dk

NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:This is a note=

 With multiple lines=

 Of text

END:VCARD


Let's start with the Regex solution. For that purpose I have created the following Regex:

private static readonly Regex rSimple = new Regex("^(?:TEL):(?<TEL>[^\r\n]*)",

    RegexOptions.ExplicitCapture |

    RegexOptions.IgnoreCase |

    RegexOptions.IgnorePatternWhitespace |

    RegexOptions.Multiline | RegexOptions.Compiled);


This regular expression is very simple: it grabs the text after the : sign up to the end of the line, for all lines that begin with TEL.

The accompanying C# code is:

string tel = rSimple.Match(contact).Groups["TEL"].Value;


That's simple :) - you cannot get it that easily with C# code only.

Let's do the C# solution as well:

int indexStart = contact.IndexOf("TEL:")+4;

int indexEnd = contact.IndexOf("\r\n", indexStart);

string tel = contact.Substring(indexStart, indexEnd - indexStart);



Again, it's not bad: three lines of code and you have the telephone number. But what happens if the TEL property line is not that simple? What if it looks like this:

TEL;CELL;HOME:+4512345678


Then we would have to revise our C# code so that it skips the parameters when they are present. By adding just a few characters to our regular expression, we can make it take the optional parameters into account. If we add ([^:]*) to our regular expression, we end up with:

^(?:TEL)([^:]*):(?<TEL>[^\r\n]*)


Then our regular expression still does the job, and we don't have to change the accompanying C# code at all. In contrast, to make our plain C# code handle parameters we would need to find the index of the first : character after TEL and take the substring from that point. Nothing hard, but more error prone.
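
Both variants can be tried side by side with a small sketch like the following. The class TelDemo and the method names are hypothetical. The pattern is written as a verbatim string without RegexOptions.IgnorePatternWhitespace, which is a small deviation from the options used earlier, and the plain C# version is deliberately as naive as the three-line version above (it takes the first occurrence of TEL anywhere in the card).

```csharp
using System;
using System.Text.RegularExpressions;

public static class TelDemo
{
    // The extended pattern from the text. ExplicitCapture keeps ([^:]*) from
    // becoming a numbered group, so only the named TEL group is captured.
    public static readonly Regex TelRegex = new Regex(
        @"^(?:TEL)([^:]*):(?<TEL>[^\r\n]*)",
        RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase |
        RegexOptions.Multiline | RegexOptions.Compiled);

    public static string GetTelRegex(string card)
    {
        // Returns an empty string when no TEL line is found
        return TelRegex.Match(card).Groups["TEL"].Value;
    }

    // Plain C# counterpart: find TEL, skip to the first ':' and read to the line end.
    public static string GetTelPlain(string card)
    {
        int lineStart = card.IndexOf("TEL", StringComparison.Ordinal);
        if (lineStart == -1) return null;

        int colon = card.IndexOf(':', lineStart);
        if (colon == -1) return null;

        int lineEnd = card.IndexOf("\r\n", colon, StringComparison.Ordinal);
        if (lineEnd == -1) lineEnd = card.Length;

        return card.Substring(colon + 1, lineEnd - colon - 1);
    }

    public static void Main()
    {
        string simple = "BEGIN:VCARD\r\nTEL:+4512345678\r\nEND:VCARD\r\n";
        string withParams = "BEGIN:VCARD\r\nTEL;CELL;HOME:+4512345678\r\nEND:VCARD\r\n";

        Console.WriteLine(GetTelRegex(simple));      // +4512345678
        Console.WriteLine(GetTelRegex(withParams));  // +4512345678
        Console.WriteLine(GetTelPlain(simple));      // +4512345678
        Console.WriteLine(GetTelPlain(withParams));  // +4512345678
    }
}
```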

As for speed: in this case, where the regular expression is so simple, it is actually noticeably faster than the C# code.

I have again run the same code 1 million times, and the results are:

Method               Milliseconds
C# code              4125
regular expression   2984.375

So you see this time, the regular expression wins the speed test, and also the extensibility and maintainability test if you ask me :)

The lesson to be learned from this blog post is that regular expressions are your friend, and can help you write code faster, in a lot fewer lines of C#. In some situations, when the regular expression is very simple, they even give you a decent performance gain.

When doing string searches, regular expressions can help you immensely, and they are not at all that dangerous or hard to learn as many people think.

Another bonus with regular expressions is that you can do very advanced pattern matching and replacement. Let's say you wanted to reformat all phone numbers in the VCARDs to a particular format; that would be possible using the same regular expression as above. Or let's say you wanted to add a prefix to all the phone numbers, for example 0, to dial an outside line; that is possible with the following line of code:

rSimple.Replace(contact, "TEL:0,${TEL}")


Which simply states that, using the same regular expression, each match is replaced with TEL: followed by 0, and then the contents of the TEL group :)

Simple right :)
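
Here is a small runnable sketch of that replacement, assuming a regular expression equivalent to rSimple. The class name PrefixDemo and the method AddOutsideLinePrefix are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixDemo
{
    public static readonly Regex TelRegex = new Regex(
        @"^(?:TEL):(?<TEL>[^\r\n]*)",
        RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase | RegexOptions.Multiline);

    public static string AddOutsideLinePrefix(string card)
    {
        // ${TEL} refers to the named group. The literal "TEL:" has to be written
        // out again, because the match consumed it along with the number.
        return TelRegex.Replace(card, "TEL:0,${TEL}");
    }

    public static void Main()
    {
        string card = "BEGIN:VCARD\r\nTEL:+4512345678\r\nEND:VCARD\r\n";
        // The TEL line becomes: TEL:0,+4512345678
        Console.Write(AddOutsideLinePrefix(card));
    }
}
```

Note that Replace returns a new string; the original contact string is left untouched.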

I hope that if you have never used regular expressions before, you will consider them now, and that if you have used them before, I have given you further reason to keep doing so :)

Implementing Basic Authentication in ASP.NET 2.0

I have many times wanted to implement basic authentication in ASP.NET applications, but have been unwilling to use the built-in basic authentication of IIS. It relies on the Windows machine's users, which are a pain to administer and do not easily tie into an existing user authentication solution you may have, like a CMS or what have you. People would probably argue that I should use forms-based authentication, but that's not always possible, for example if you want your services to be accessible from a program that runs on another machine, such as a service that polls for data. That program needs Basic or Digest authentication, since those are much easier to implement when the client is not a browser.

You could use the built-in WindowsAuthentication HTTP module of ASP.NET, but then you are stuck with Windows users and have to manage roles for them in Windows as well, and that kind of sucks, since you would probably want to use your application's own user administration to manage access to your application.

So what you want to do is create your own HttpModule that provides Basic Authentication functionality.

It sounds harder than it actually is, and I have created a complete package of files you can copy/paste; implement a little code yourself, and you have a ready-to-plug-in Basic authentication module.

What I have created is an HttpModule that takes care of the Basic Authentication, plus a couple of interfaces that need implementing. The implementations of the interfaces tell the Basic Authentication HttpModule whether or not a given user is a valid user, and whether or not the user is allowed to see a given page or make a given request.

The code for the interfaces is pretty simple, which interfaces usually are, since it's the implementation that does all the work :)

Interfaces needed:
using System;
using System.Collections.Generic;
using System.Text;
using System.Web;
using System.Security.Principal;

namespace Smithfamily.Blog.Samples
{
    /// <summary>
    /// An authentication and authorization provider for very simple applications
    /// Should probably be either implemented with a database backend, 
    /// or using a web.config custom section
    /// Implementors of this interface should provide a default no args constructor to be used
    /// by the AuthenticationModule
    /// </summary>
    public interface IAuthProvider : IDisposable
    {
        /// <summary>
        /// Validates the username and password and returns whether or not the combination is a valid user
        /// </summary>
        /// <param name="userName">The username to validate</param>
        /// <param name="password">The password to match</param>
        /// <param name="user">The user object created</param>
        /// <returns>true if the combination is a valid user;false otherwise</returns>
        bool IsValidUser(string userName, string password, out IBasicUser user);

        /// <summary>
        /// Determines whether or not the current request is allowed to continue for the given user
        /// </summary>
        /// <param name="request">The request to check</param>
        /// <param name="user">The user</param>
        /// <returns>true if request is authorized;false otherwise</returns>
        bool IsRequestAllowed(HttpRequest request, IBasicUser user);

    }

    /// <summary>
    /// interface for a very simple user object that contains the bare 
    /// minimum to do authentication against a real backend
    /// </summary>
    public interface IBasicUser : IIdentity
    {
        /// <summary>
        /// Gets or sets the username of the user.
        /// </summary>
        /// <value>The username of the user.</value>
        string UserName
        {
            get;
            set;
        }
        /// <summary>
        /// Gets or sets the password.
        /// </summary>
        /// <value>The password.</value>
        string Password
        {
            get;
            set;
        }
    }
}


The IAuthProvider is the interface for the class you need to implement; it looks up users in your backend and validates whether or not a user has access to a given resource. The IBasicUser is an interface for a very simple user object that contains the bare minimum needed to authenticate a user. I have implemented IBasicUser (the BasicUser class), and an instance of it is what gets passed to the configured IAuthProvider when a request is authorized.

I have also made a silly implementation of the IAuthProvider that will accept any users for logon, but will only authorize a user with the username bjorn, i.e. anyone can log on, but only I am allowed to do anything. The implementation is just as an example, please don't implement your versions like this, but rather read the users from a database, and validate their password properly.

The HttpModule itself is pretty simple as well.

Basic Authentication Module:

using System;

using System.Collections.Generic;

using System.Text;

using System.Web;

using System.Security.Principal;

using System.Configuration;

using System.Reflection;

 

namespace Smithfamily.Blog.Samples

{

    /// <summary>

    /// HttpModule that provides Basic authentication for asp.net applications

    /// </summary>

    public class BasicAuthenticationModule : IHttpModule

    {

        private static IAuthProvider authProvider;

 

        #region IHttpModule Members

 

        /// <summary>

        /// Initializes the <see cref="BasicAuthenticationModule"/> class.

        /// Instantiates the IAuthProvider configured in the web.config

        /// </summary>

        static BasicAuthenticationModule()

        {

            string provider =

                ConfigurationManager.AppSettings["Smithfamily.Blog.Samples.BasicAuthenticationModule.AuthProvider"];

            Type providerType = Type.GetType(provider, true);

            authProvider = Activator.CreateInstance(providerType, false) as IAuthProvider;

        }

 

        /// <summary>

        /// Disposes of the resources (other than memory) used by the module that implements <see cref="T:System.Web.IHttpModule"/>.

        /// </summary>

        public void Dispose()

        {

            authProvider.Dispose();

            authProvider = null;

        }

 

        /// <summary>

        /// Initializes a module and prepares it to handle requests.

        /// </summary>

        /// <param name="context">An <see cref="T:System.Web.HttpApplication"/> that provides access to the methods, properties, and events common to all application objects within an ASP.NET application</param>

        public void Init(HttpApplication context)

        {

            context.AuthenticateRequest += new EventHandler(context_AuthenticateRequest);

            context.AuthorizeRequest += new EventHandler(context_AuthorizeRequest);

            context.BeginRequest += new EventHandler(context_BeginRequest);

 

        }

 

        /// <summary>

        /// Handles the BeginRequest event of the context control.

        /// </summary>

        /// <param name="sender">The source of the event.</param>

        /// <param name="e">The <see cref="System.EventArgs"/> instance containing the event data.</param>

        void context_BeginRequest(object sender, EventArgs e)

        {

 

            HttpApplication context = sender as HttpApplication;

            if (context.User == null)

            {

                if (!TryAuthenticate(context))

                {

                    SendAuthHeader(context);

                    return;

                }

 

            }

            BasicUser bu = context.User.Identity as BasicUser;

            context.Response.Write(string.Format("Welcome {0} with the password:{1}", bu.UserName, bu.Password));

        }

 

        /// <summary>

        /// Sends the Unauthorized header to the user, telling the user to provide a valid username and password

        /// </summary>

        /// <param name="context">The context.</param>

        private void SendAuthHeader(HttpApplication context)

        {

            context.Response.Clear();

            context.Response.StatusCode = 401;

            context.Response.StatusDescription = "Unauthorized";

            context.Response.AddHeader("WWW-Authenticate", "Basic realm=\"Secure Area\"");

            context.Response.Write("401 baby, please authenticate");

            context.Response.End();

        }

 

 

        /// <summary>

        /// Handles the AuthorizeRequest event of the context control.

        /// </summary>

        /// <param name="sender">The source of the event.</param>

        /// <param name="e">The <see cref="System.EventArgs"/> instance containing the event data.</param>

        void context_AuthorizeRequest(object sender, EventArgs e)

        {

            HttpApplication context = sender as HttpApplication;

 

            BasicUser bu = context.User.Identity as BasicUser;

            if (!authProvider.IsRequestAllowed(context.Request, bu))

            {

                SendNotAuthorized(context);

            }

        }

        /// <summary>

        /// Sends the not authorized headers to the user

        /// </summary>

        /// <param name="context">The context.</param>

        private void SendNotAuthorized(HttpApplication context)

        {

            context.Response.Clear();

            context.Response.StatusCode = 403;

            context.Response.StatusDescription = "Forbidden";

            context.Response.Write("403 baby, You are not allowed to see this");

            context.Response.End();

        }

 

        /// <summary>

        /// Tries to authenticate the user

        /// </summary>

        /// <param name="context">The context.</param>

        /// <returns></returns>

        private bool TryAuthenticate(HttpApplication context)

        {

            string authHeader = context.Request.Headers["Authorization"];

            if (!string.IsNullOrEmpty(authHeader))

            {

                if (authHeader.StartsWith("basic ", StringComparison.InvariantCultureIgnoreCase))

                {

 

                    string userNameAndPassword = Encoding.Default.GetString(

                        Convert.FromBase64String(authHeader.Substring(6)));

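                    //Split on the first colon only, since the password itself may contain a colon

                    string[] parts = userNameAndPassword.Split(new char[] { ':' }, 2);

                    IBasicUser bu = null;

                    if (parts.Length == 2 && authProvider.IsValidUser(parts[0], parts[1], out bu))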

                    {

                        context.Context.User = new GenericPrincipal(bu, new string[] { });

                        if (!authProvider.IsRequestAllowed(context.Request, bu))

                        {

                            SendNotAuthorized(context);

                            return false;

                        }

                        return true;

                    }

 

                }

 

            }

            return false;

        }

 

        /// <summary>

        /// Handles the AuthenticateRequest event of the context control.

        /// </summary>

        /// <param name="sender">The source of the event.</param>

        /// <param name="e">The <see cref="System.EventArgs"/> instance containing the event data.</param>

        void context_AuthenticateRequest(object sender, EventArgs e)

        {

            HttpApplication context = sender as HttpApplication;

            TryAuthenticate(context);

 

        }

 

        #endregion

    }

 

    /// <summary>

    /// Sample IAuthProvider that will authenticate all users, and only allow access to user with a username of bjorn

    /// </summary>

    public class BasicAuthProvider : IAuthProvider

    {

 

        #region IAuthProvider Members

 

        /// <summary>

        /// Validates the username and password and returns whether or not the combination is a valid user

        /// </summary>

        /// <param name="userName">The username to validate</param>

        /// <param name="password">The password to match</param>

        /// <param name="user">The user object created</param>

        /// <returns>

        /// true if the combination is a valid user;false otherwise

        /// </returns>

        public bool IsValidUser(string userName, string password, out IBasicUser user)

        {

            user = new BasicUser();

            user.UserName = userName;

            user.Password = password;

 

            return true;

        }

 

        /// <summary>

        /// Determines whether or not the current request is allowed to continue for the given user

        /// </summary>

        /// <param name="request">The request to check</param>

        /// <param name="user">The user</param>

        /// <returns>

        /// true if request is authorized;false otherwise

        /// </returns>

        public bool IsRequestAllowed(HttpRequest request, IBasicUser user)

        {

            return user.UserName == "bjorn";

        }

 

 

        /// <summary>

        /// Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.

        /// </summary>

        public void Dispose()

        {

            //This is intentional, since we don't have any resources to free in this very simple sample IAuthProvider

        }

 

        #endregion

    }

 

    public class BasicUser : IBasicUser

    {

        /// <summary>

        /// Gets or sets the username of the user

        /// </summary>

        /// <value>The username of the user.</value>

        public string UserName

        {

            get;

            set;

        }

        /// <summary>

        /// Gets or sets the password.

        /// </summary>

        /// <value>The password.</value>

        public string Password

        {

            get;

            set;

        }

 

        #region IIdentity Members

 

        /// <summary>

        /// Gets the type of authentication used.

        /// </summary>

        /// <value></value>

        /// <returns>

        /// The type of authentication used to identify the user.

        /// </returns>

        public string AuthenticationType

        {

            get

            {

                return "Custom";

            }

        }

 

        /// <summary>

        /// Gets a value that indicates whether the user has been authenticated.

        /// </summary>

        /// <value></value>

        /// <returns>true if the user was authenticated; otherwise, false.

        /// </returns>

        public bool IsAuthenticated

        {

            get

            {

                return UserName != null;

            }

        }

 

        /// <summary>

        /// Gets the name of the current user.

        /// </summary>

        /// <value></value>

        /// <returns>

        /// The name of the user on whose behalf the code is running.

        /// </returns>

        public string Name

        {

            get

            {

                return UserName;

            }

        }

 

        #endregion

    }

}


Let me go through the code in the BasicAuthenticationModule step by step, so you understand what is happening.

Initialization:
/// <summary>
/// Initializes the <see cref="BasicAuthenticationModule"/> class.
/// Instantiates the IAuthProvider configured in the web.config
/// </summary>
static BasicAuthenticationModule()
{
    string provider =
        ConfigurationManager.AppSettings["Smithfamily.Blog.Samples.BasicAuthenticationModule.AuthProvider"];
    Type providerType = Type.GetType(provider, true);
    authProvider = Activator.CreateInstance(providerType, false) as IAuthProvider;
}


These lines of code configure the authentication module, and run only once per application start.

They look in the Web.config for an appSettings parameter called Smithfamily.Blog.Samples.BasicAuthenticationModule.AuthProvider, and try to create an instance of the fully qualified type name given there, to use as the implementation of IAuthProvider. This web.config parameter is where you configure the basic authentication module to use your IAuthProvider implementation, i.e.

Web.config configuration of BasicAuthenticationModule:
<appSettings>
    <add key="Smithfamily.Blog.Samples.BasicAuthenticationModule.AuthProvider"
         value="Smithfamily.Blog.Samples.BasicAuthProvider"/>
</appSettings>
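
The lookup from a type name to a live instance can be tried outside of ASP.NET with a sketch like this one. IGreeter, DefaultGreeter and ProviderLoader are hypothetical stand-ins for IAuthProvider, your provider class and the module's static constructor; in the real module the type name comes from the appSettings value instead of typeof.

```csharp
using System;

// Hypothetical stand-in for IAuthProvider
public interface IGreeter
{
    string Greet();
}

// Hypothetical stand-in for your provider; note the public no-args constructor
public class DefaultGreeter : IGreeter
{
    public string Greet() { return "hello"; }
}

public static class ProviderLoader
{
    public static IGreeter Load(string typeName)
    {
        // throwOnError = true: an unknown type name raises a TypeLoadException
        // instead of silently returning null
        Type providerType = Type.GetType(typeName, true);

        IGreeter provider = Activator.CreateInstance(providerType) as IGreeter;
        if (provider == null)
        {
            // The "as" cast yields null when the type does not implement the interface
            throw new InvalidOperationException(typeName + " does not implement IGreeter");
        }
        return provider;
    }

    public static void Main()
    {
        // In the module, this string would be read from the Web.config
        IGreeter greeter = Load(typeof(DefaultGreeter).FullName);
        Console.WriteLine(greeter.Greet()); // hello
    }
}
```

One thing to keep in mind: Type.GetType with a name that is not assembly qualified only searches the calling assembly and the core library, so if your provider lives in another assembly than the module, the configured value probably needs the "Namespace.TypeName, AssemblyName" form.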



The following lines of code simply call Dispose on the implementation of the IAuthProvider, just in case there are some resources that need to be disposed.

Dispose:
/// <summary>
/// Disposes of the resources (other than memory) used by the module that 
/// implements <see cref="T:System.Web.IHttpModule"/>.
/// </summary>
public void Dispose()
{
    authProvider.Dispose();
    authProvider = null;
}


The following lines of code simply tell the application that the module wants to take part in the following events:

AuthenticateRequest, AuthorizeRequest and BeginRequest.

We need hooks on these events because this is where we will do our magic; without event handlers registered on these events there will be no basic authentication.

Init:
/// <summary>
/// Initializes a module and prepares it to handle requests.
/// </summary>
/// <param name="context">An <see cref="T:System.Web.HttpApplication"/> that provides access to the methods, 
/// properties, and events common to all application objects within an ASP.NET application</param>
public void Init(HttpApplication context)
{
    context.AuthenticateRequest += new EventHandler(context_AuthenticateRequest);
    context.AuthorizeRequest += new EventHandler(context_AuthorizeRequest);
    context.BeginRequest += new EventHandler(context_BeginRequest);
}


The following event handler gets called each time a request begins on the server, and this is the perfect place to try to authenticate the user, which is what we do.

We call the method TryAuthenticate, and if we get false back from that method, we send the needed authentication headers to the user and return.

If the TryAuthenticate method returns true, then we just let everything flow, but inject some silly text at the top of the page.
You need to remove the last two lines of the method when using this module, otherwise all pages will contain the text:
Welcome user with the password: xxxx, which is kind of not cool :)

context_BeginRequest:
/// <summary>
/// Handles the BeginRequest event of the context control.
/// </summary>
/// <param name="sender">The source of the event.</param>
/// <param name="e">The <see cref="System.EventArgs"/> 
/// instance containing the event data.</param>
void context_BeginRequest(object sender, EventArgs e)
{
    HttpApplication context = sender as HttpApplication;
    if (context.User == null)
    {
        if (!TryAuthenticate(context))
        {
            SendAuthHeader(context);
            return;
        }
    }
    BasicUser bu = context.User.Identity as BasicUser;
    context.Response.Write(
        string.Format("Welcome {0} with the password:{1}", bu.UserName, bu.Password));
}


The method SendAuthHeader as shown below simply sends the required headers to the browser, making it prompt the user for a user name and password. When using this code change the line where the WWW-Authenticate header is added and change the "Secure Area" to what you want, i.e. your application name.

SendAuthHeader:
/// <summary>
/// Sends the Unauthorized header to the user, telling the user to provide a valid username and password
/// </summary>
/// <param name="context">The context.</param>
private void SendAuthHeader(HttpApplication context)
{
    context.Response.Clear();
    context.Response.StatusCode = 401;
    context.Response.StatusDescription = "Unauthorized";
    context.Response.AddHeader("WWW-Authenticate", "Basic realm=\"Secure Area\"");
    context.Response.Write("401 baby, please authenticate");
    context.Response.End();
}


The login box for the above code will look something like the one below if you are using firefox.



The method below handles the authorization part, i.e. checking whether or not the user is allowed to do what he is trying to do. At this point the user is already logged on, and we know that the user is a valid user, so all we do is grab the user from the HttpApplication and ask the implementation of the IAuthProvider whether or not the user is allowed to make this request.

If the user is not allowed to do what he is trying to do, we send an HTTP response back indicating that the user has no permission to do it.

If the user is allowed, then we just let things flow.

context_AuthorizeRequest:
/// <summary>
/// Handles the AuthorizeRequest event of the context control.
/// </summary>
/// <param name="sender">The source of the event.</param>
/// <param name="e">The <see cref="System.EventArgs"/>
/// instance containing the event data.</param>
void context_AuthorizeRequest(object sender, EventArgs e)
{
    HttpApplication context = sender as HttpApplication;

    BasicUser bu = context.User.Identity as BasicUser;
    if (!authProvider.IsRequestAllowed(context.Request, bu))
    {
        SendNotAuthorized(context);
    }
}


The method below simply sends the correct headers, telling the browser that the user is not allowed to do what he is trying, and therefore the browser should not retry.

SendNotAuthorized:
/// <summary>
/// Sends the not authorized headers to the user
/// </summary>
/// <param name="context">The context.</param>
private void SendNotAuthorized(HttpApplication context)
{
    context.Response.Clear();
    context.Response.StatusCode = 403;
    context.Response.StatusDescription = "Forbidden";
    context.Response.Write("403 baby, You are not allowed to see this");
    context.Response.End();
}


The method TryAuthenticate is where the "magic" happens. This is where the basic authentication header is checked, and then, provided the user actually sent a username and password, we ask the IAuthProvider whether or not the user is a valid user.

If we get a go from the IAuthProvider that the username and password are a valid combination, then we inject an implementation of the IBasicUser into the HttpApplication for further use in the application. Then we proceed to ask whether or not the user is allowed to do what he is doing, and if everything checks out okay we return true, otherwise we return false.

Please note that the IBasicUser we put into the HttpApplication can be accessed from any ASP.NET page, just by accessing the Page's property called User, so it's a neat way to inject information about the current user into the standard objects of ASP.NET.

TryAuthenticate:
/// <summary>
/// Tries to authenticate the user
/// </summary>
/// <param name="context">The context.</param>
/// <returns></returns>
private bool TryAuthenticate(HttpApplication context)
{
    string authHeader = context.Request.Headers["Authorization"];
    if (!string.IsNullOrEmpty(authHeader))
    {
        if (authHeader.StartsWith("basic ", StringComparison.InvariantCultureIgnoreCase))
        {

            string userNameAndPassword = Encoding.Default.GetString(
                Convert.FromBase64String(authHeader.Substring(6)));
            //Split on the first colon only, since the password itself may contain a colon
            string[] parts = userNameAndPassword.Split(new char[] { ':' }, 2);
            IBasicUser bu = null;
            if (parts.Length == 2 && authProvider.IsValidUser(parts[0], parts[1], out bu))
            {
                context.Context.User = new GenericPrincipal(bu, new string[] { });
                if (!authProvider.IsRequestAllowed(context.Request, bu))
                {
                    SendNotAuthorized(context);
                    return false;
                }
                return true;
            }
        }
    }
    return false;
}
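
The header parsing part of TryAuthenticate can be tried in isolation, without a web server, with a sketch like the following. The class BasicHeader and its TryParse method are made up for the example, and it decodes the bytes as UTF-8 where the module uses Encoding.Default, which is an assumption about what your clients send.

```csharp
using System;
using System.Text;

public static class BasicHeader
{
    // Splits an Authorization header value into user name and password.
    // Returns false when the header is missing, not Basic, or malformed.
    public static bool TryParse(string authHeader, out string userName, out string password)
    {
        userName = null;
        password = null;

        if (string.IsNullOrEmpty(authHeader) ||
            !authHeader.StartsWith("Basic ", StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        byte[] raw;
        try
        {
            raw = Convert.FromBase64String(authHeader.Substring(6).Trim());
        }
        catch (FormatException)
        {
            // The client sent something that is not valid Base64
            return false;
        }

        string decoded = Encoding.UTF8.GetString(raw);

        // Only the first colon separates the two parts; the password may contain more
        int colon = decoded.IndexOf(':');
        if (colon == -1)
        {
            return false;
        }

        userName = decoded.Substring(0, colon);
        password = decoded.Substring(colon + 1);
        return true;
    }

    public static void Main()
    {
        string header = "Basic " +
            Convert.ToBase64String(Encoding.UTF8.GetBytes("bjorn:pass:word"));

        string user;
        string pass;
        if (TryParse(header, out user, out pass))
        {
            Console.WriteLine("{0} / {1}", user, pass); // bjorn / pass:word
        }
    }
}
```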


The method below simply tries to authenticate the user on each authentication event, using the method TryAuthenticate. Simple as that.

context_AuthenticateRequest:
/// <summary>
/// Handles the AuthenticateRequest event of the context control.
/// </summary>
/// <param name="sender">The source of the event.</param>
/// <param name="e">The <see cref="System.EventArgs"/> 
/// instance containing the event data.</param>
void context_AuthenticateRequest(object sender, EventArgs e)
{
    HttpApplication context = sender as HttpApplication;
    TryAuthenticate(context);
}


See, that wasn't so hard. What you need to do to make this work is simply:

  • Implement IAuthProvider using your own database of users, your own XML structure or whatever means of authenticating and authorizing users you have.
  • Add the Module to your application by editing the web.config and putting the following lines into the web.config
Adding module to web.config:
<httpModules>
  <add name="BasicAuthenticationModule"
       type="Smithfamily.Blog.Samples.BasicAuthenticationModule"/>
</httpModules>


Naturally there will be other modules present, just inject the module line as the last element in the <httpModules> collection.

  • Configure the module by adding the following lines to the web.config. Please remember to add the full name, i.e. Your.NameSpace.YourClassName

 

Web.config configuration of BasicAuthenticationModule:
<appSettings>
    <add key="Smithfamily.Blog.Samples.BasicAuthenticationModule.AuthProvider"
         value="Your.NameSpace.YourClassName"/>
</appSettings>
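
To show why the full name matters, here is a minimal sketch of one way a configured type name can be turned into an instance through reflection. This is an assumption about how such a lookup could work, not the module's actual code, and ProviderLoader and Create are names I made up for the illustration. In the real module the name would come from the appSettings key above; here it is passed in as a plain string so the sketch runs on its own.

```csharp
using System;

public static class ProviderLoader
{
    // Hypothetical helper: creates an instance of the named type and casts it to T.
    // typeName must be namespace-qualified. For a type that lives in another
    // assembly, Type.GetType also needs the assembly name ("Ns.Class, AssemblyName").
    public static T Create<T>(string typeName) where T : class
    {
        Type type = Type.GetType(typeName, false);
        if (type == null)
        {
            throw new InvalidOperationException("Type not found: " + typeName);
        }

        // Requires a public parameterless constructor on the configured type.
        T instance = Activator.CreateInstance(type) as T;
        if (instance == null)
        {
            throw new InvalidOperationException(
                typeName + " is not a " + typeof(T).FullName);
        }
        return instance;
    }

    public static void Main()
    {
        // StringBuilder stands in for your IAuthProvider implementation here.
        object provider = Create<object>("System.Text.StringBuilder");
        Console.WriteLine(provider.GetType().FullName); // prints System.Text.StringBuilder
    }
}
```

A class name without its namespace is not found by this kind of lookup, which is the most common reason for a provider failing to load.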



That should be it. Just allow anonymous access in the IIS configuration and remove all other authentication options in IIS, and you should be set.

I have seen cases where it still doesn't work. If that's the case, try adding the following items to the Web.config.

  • Turn on the authentication module by adding the <authentication> element with the mode None. It's strange, but it has to be None, otherwise ASP.NET will use one of the built-in authentication modes, which kind of defeats the purpose of this module :)
  • Put in the <authorization> element and specify that anonymous users are not allowed

 

Turn on authentication in Web.config:
 <system.web>
      <authentication mode="None" />
      <authorization>
        <deny users="?" />
      </authorization>
 </system.web>



I hope this gave you an insight into how you can implement basic authentication pretty easily. Remember that you can implement your own IBasicUser and put all kinds of other stuff in there, like items from your application. You then have access to everything through the Page.User property, like:

Accessing the current user in asp.net markup:
<% IBasicUser user = User.Identity as IBasicUser;
   Response.Write(user.UserName); %>



Please let me know if this example doesn't work for you and I will try to help you make it work :)

SQL Server subscriptions on non standard port

When setting up replication using publications and subscriptions, things get tricky if your publishing server is running on a port other than the standard one (1433).

If you try to enter the server name as SERVERNAME,port number you get an error stating that you cannot use IP addresses, aliases or other strange names.

So the only option you are left with is to use the REAL name of the SQL server.

So let's say your server's SQL name is COMPANYSQL01. You need to use that server name, but if you are running on a different port number, or even with multiple instances on the server, each running on a different port, you are in BIG trouble.

Let's take the following configuration.

You have a SQL Server 2005, configured with the name COMPANYSQL01.

You have two instances on that server, INSTANCE1 and INSTANCE2.

INSTANCE1 runs on the standard port 1433, and you have configured the next instance, INSTANCE2, to run on 1434.

So normally you would connect to INSTANCE2 by entering COMPANYSQL01\INSTANCE2,1434

But that's not possible in Subscription configuration, since you are not allowed to enter a port number.

So what you need to do is the following:
1. Connect to the instance you want to create a subscription from.

2. Open a query window and enter the following: SELECT @@SERVERNAME (with the example above this should yield COMPANYSQL01\INSTANCE2).

3. Open the SQL Server Configuration Manager on the server that is going to be the subscriber.

4. Expand the SQL Native Client configuration (32bit) node.

5. Expand Aliases, and create a new alias:

5.1 For alias name enter COMPANYSQL01\INSTANCE2
5.2 In port number enter the non-standard port number, i.e. 1434 in this example
5.3 Leave protocol at TCP/IP
5.4 Enter COMPANYSQL01\INSTANCE2 as server name.
5.5 If the server is a 64bit machine, repeat steps 4 - 5.4 in the SQL Native Client configuration node (the one without "32bit") in SQL Server Configuration Manager
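
Before setting up the subscription, it can be worth checking from the subscriber machine that the alias really resolves. The following is a minimal sketch under a few assumptions: it uses the .NET Framework System.Data.SqlClient provider (which, as far as I know, honours client aliases), it logs in with Windows authentication, and AliasCheck and BuildConnectionString are names made up for this illustration. Note that a 32bit process uses the 32bit alias list and a 64bit process the 64bit one.

```csharp
using System;
using System.Data.SqlClient;

public static class AliasCheck
{
    // Hypothetical helper: names the server exactly like the alias, with no
    // port number, so that the alias is what supplies the port.
    public static string BuildConnectionString(string serverName)
    {
        return "Server=" + serverName + ";Integrated Security=true;Connect Timeout=5";
    }

    public static void Main()
    {
        string server = @"COMPANYSQL01\INSTANCE2";
        using (SqlConnection connection = new SqlConnection(BuildConnectionString(server)))
        {
            connection.Open();
            using (SqlCommand command = new SqlCommand("SELECT @@SERVERNAME", connection))
            {
                // Should report the same name that step 2 returned on the publisher.
                Console.WriteLine("Connected to: {0}", command.ExecuteScalar());
            }
        }
    }
}
```

If Open throws, the alias or the name resolution is not in place yet.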




If you do not have name resolution in place on your subscriber server, i.e. you cannot ping COMPANYSQL01, then you need to fix that, either by updating your DNS or by adding a host entry in your hosts file (C:\Windows\System32\Drivers\etc\hosts). Name resolution issues can occur because your SQL server is on another network than your subscriber, e.g. your subscriber is on your local LAN, and the server that you want subscriptions from is in your DMZ.

Even though the subscription configuration explicitly tells you that aliases do not work with subscriptions, they do if you use the same name for the alias as the real name.
It's strange, I know, but it works, it really does.

DateTime time zone conversion with .NET 3.5

Some time ago I wrote a blog post about time zone conversion and handling within .NET, which sucked at that time.

With .NET 3.5 it's better, in fact much better; it's actually usable :)

In .NET 1.1 and 2.0 there was no built-in way to convert DateTime objects between different time zones, so you had to handle all the hassle of adding minutes, handling summer and winter time, etc. yourself.

There is now. It's not perfect, but it does the job.

It's pretty straightforward.

I will provide an example, which should give you an idea of how to use it in real life.

The example below takes two DateTime objects: one UTC DateTime and one Danish DateTime.

Both DateTime objects are created with the same numbers; one is just specified as UTC, the other one as local time.

//Create DateTime with UTC Kind specified
DateTime utcNow = new DateTime(2008, 8, 16, 21, 42, 32, DateTimeKind.Utc);

//Create Danish local DateTime
DateTime dkNow = new DateTime(2008, 8, 16, 21, 42, 32, DateTimeKind.Local);
//Find timezone info for Egypt
TimeZoneInfo egypt = TimeZoneInfo.FindSystemTimeZoneById("Egypt Standard Time");
//Convert Danish local time to Egypt time
DateTime egyptTime1 = TimeZoneInfo.ConvertTimeBySystemTimeZoneId(dkNow, "Egypt Standard Time");
//Convert UTC time to Egypt local time
DateTime egyptTime2 = TimeZoneInfo.ConvertTimeFromUtc(utcNow, egypt);

//Write out initial DateTime objects
Console.WriteLine("{0} {1}", dkNow, dkNow.ToString("%K"));
//Writes out 16-08-2008 21:42:32 +02:00
Console.WriteLine("{0} {1}", utcNow, utcNow.ToString("%K"));
//Writes out 16-08-2008 21:42:32 Z
//Write out the results
//Note: %K on a Local kind DateTime prints the offset of the machine running the code
//(Danish time, +02:00), not the offset of the Egypt time zone
Console.WriteLine("Danish NOW converted to Egypt time:{0} {1}", egyptTime1,
    DateTime.SpecifyKind(egyptTime1, DateTimeKind.Local).ToString("%K"));
//Writes out Danish NOW converted to Egypt time:16-08-2008 22:42:32 +02:00
Console.WriteLine("UTC NOW converted to Egypt time:{0} {1}", egyptTime2,
    DateTime.SpecifyKind(egyptTime2, DateTimeKind.Local).ToString("%K"));
//Writes out UTC NOW converted to Egypt time:17-08-2008 00:42:32 +02:00
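
The +02:00 printed for the converted values comes from the local machine, because the results are re-labelled as Local before %K is applied. If you would rather have the converted value carry the target zone's own offset, DateTimeOffset (also part of .NET 3.5) is one option. A minimal sketch, where ZoneConversion and ToZone are names I made up for this illustration, and where the time zone ID is only available on machines that know the Windows time zone names:

```csharp
using System;

public static class ZoneConversion
{
    // Hypothetical helper: treats the input as UTC and converts it to the given
    // zone. The returned DateTimeOffset keeps the zone's own offset for that instant.
    public static DateTimeOffset ToZone(DateTime utc, TimeZoneInfo zone)
    {
        DateTimeOffset source =
            new DateTimeOffset(DateTime.SpecifyKind(utc, DateTimeKind.Utc));
        return TimeZoneInfo.ConvertTime(source, zone);
    }

    public static void Main()
    {
        TimeZoneInfo egypt = TimeZoneInfo.FindSystemTimeZoneById("Egypt Standard Time");
        DateTime utcNow = new DateTime(2008, 8, 16, 21, 42, 32, DateTimeKind.Utc);

        DateTimeOffset egyptTime = ToZone(utcNow, egypt);

        // zzz prints the offset stored in the DateTimeOffset itself,
        // independent of the machine's local time zone.
        Console.WriteLine(egyptTime.ToString("dd-MM-yyyy HH:mm:ss zzz"));
    }
}
```

The exact offset printed depends on the time zone data installed on the machine, since the daylight saving rules are looked up there.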

As you can clearly see from the example, it is fairly easy to convert DateTimes between time zones, but there is just one question that springs to mind.

Where do you get the list of time zone IDs?

Well, I will provide you with one here, but it's a crappy solution Microsoft has chosen for time zone names.

 

Timezone ID, followed by UTC offset (hours:minutes)
Morocco Standard Time 0:0
GMT Standard Time 0:0
Greenwich Standard Time 0:0
W. Europe Standard Time 1:0
Central Europe Standard Time 1:0
Romance Standard Time 1:0
Central European Standard Time 1:0
W. Central Africa Standard Time 1:0
Jordan Standard Time 2:0
GTB Standard Time 2:0
Middle East Standard Time 2:0
Egypt Standard Time 2:0
South Africa Standard Time 2:0
FLE Standard Time 2:0
Israel Standard Time 2:0
E. Europe Standard Time 2:0
Namibia Standard Time 2:0
Arabic Standard Time 3:0
Arab Standard Time 3:0
Russian Standard Time 3:0
E. Africa Standard Time 3:0
Georgian Standard Time 3:0
Iran Standard Time 3:30
Arabian Standard Time 4:0
Azerbaijan Standard Time 4:0
Caucasus Standard Time 4:0
Armenian Standard Time 4:0
Afghanistan Standard Time 4:30
Ekaterinburg Standard Time 5:0
Pakistan Standard Time 5:0
West Asia Standard Time 5:0
India Standard Time 5:30
Sri Lanka Standard Time 5:30
Nepal Standard Time 5:45
N. Central Asia Standard Time 6:0
Central Asia Standard Time 6:0
Myanmar Standard Time 6:30
SE Asia Standard Time 7:0
North Asia Standard Time 7:0
China Standard Time 8:0
North Asia East Standard Time 8:0
Singapore Standard Time 8:0
W. Australia Standard Time 8:0
Taipei Standard Time 8:0
Tokyo Standard Time 9:0
Korea Standard Time 9:0
Yakutsk Standard Time 9:0
Cen. Australia Standard Time 9:30
AUS Central Standard Time 9:30
E. Australia Standard Time 10:0
AUS Eastern Standard Time 10:0
West Pacific Standard Time 10:0
Tasmania Standard Time 10:0
Vladivostok Standard Time 10:0
Central Pacific Standard Time 11:0
New Zealand Standard Time 12:0
Fiji Standard Time 12:0
Tonga Standard Time 13:0
Azores Standard Time -1:0
Cape Verde Standard Time -1:0
Mid-Atlantic Standard Time -2:0
E. South America Standard Time -3:0
Argentina Standard Time -3:0
SA Eastern Standard Time -3:0
Greenland Standard Time -3:0
Montevideo Standard Time -3:0
Newfoundland Standard Time -3:30
Atlantic Standard Time -4:0
SA Western Standard Time -4:0
Central Brazilian Standard Time -4:0
Pacific SA Standard Time -4:0
Venezuela Standard Time -4:30
SA Pacific Standard Time -5:0
Eastern Standard Time -5:0
US Eastern Standard Time -5:0
Central America Standard Time -6:0
Central Standard Time -6:0
Central Standard Time (Mexico) -6:0
Mexico Standard Time -6:0
Canada Central Standard Time -6:0
US Mountain Standard Time -7:0
Mountain Standard Time (Mexico) -7:0
Mexico Standard Time 2 -7:0
Mountain Standard Time -7:0
Pacific Standard Time -8:0
Pacific Standard Time (Mexico) -8:0
Alaskan Standard Time -9:0
Hawaiian Standard Time -10:0
Samoa Standard Time -11:0
Dateline Standard Time -12:0
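
The list will differ between Windows versions and updates, so it may be handy to generate it on your own machine. A minimal sketch, assuming you only want the ID and the base (non daylight saving) offset; TimeZoneLister and FormatOffset are names made up for this illustration:

```csharp
using System;

public static class TimeZoneLister
{
    // Formats an offset as a signed hh:mm string, e.g. -03:30 or +05:45.
    public static string FormatOffset(TimeSpan offset)
    {
        string sign = offset < TimeSpan.Zero ? "-" : "+";
        TimeSpan abs = offset.Duration(); // absolute value, so minutes never go negative
        return string.Format("{0}{1:00}:{2:00}", sign, abs.Hours, abs.Minutes);
    }

    public static void Main()
    {
        // Enumerates every time zone the operating system knows about.
        foreach (TimeZoneInfo zone in TimeZoneInfo.GetSystemTimeZones())
        {
            Console.WriteLine("{0} {1}", zone.Id, FormatOffset(zone.BaseUtcOffset));
        }
    }
}
```

Formatting the sign separately avoids odd output for negative half-hour offsets, which printing the hour and minute parts directly would produce.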

 

Naturally the best solution would have been to use the Zoneinfo names for time zones, since they are a lot more intuitive, a lot of websites out there already use them, and, oh btw, so do all the Unix/Linux/Mac OS systems out there :) http://en.wikipedia.org/wiki/Zoneinfo