Configuration file magic via Smith.BuildExtensions

I am sure everyone has had the "pleasure" of maintaining configuration files across projects and even solutions, copying and pasting configuration data between them to keep them in sync, and has run into the same issues as everyone else, i.e. missing variables in one project, missing sections and so forth.

So have I, and in my previous job we used NAnt and a custom-built script to transform our app.config and web.config into the correct version for the target we were building.

In my new job we are having the exact same problem, surprise :) - and instead of "polluting" our code base with NAnt (we are running TFS, so NAnt does not fit well there), I decided to build my own MSBuild task that does basically the same thing, i.e. transforms a configuration template by exchanging "tokens" or variables with configured elements or values from one or more configuration files.

I have done that now, and you can see it all in its simple splendor at codeplex.

Basically, you add a little to the project files of the projects where you want to use the configuration sharing and transformation, and you create templates for your app.config and web.config plus a few files for the variables. The next time you build, you get a configuration file that matches the build target you selected in Visual Studio, with warnings and errors in the Error List if configuration variables are missing for that build target.
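
To make the idea of token replacement concrete, here is a minimal sketch of my own. It is not the code in Smith.BuildExtensions: the `${Name}` token syntax, the `TemplateTransformer` class and its `Transform` method are all assumptions chosen for illustration. It replaces known tokens and collects the names of missing ones, which is the kind of information a build task could report as warnings or errors.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Smith.BuildExtensions.
public static class TemplateTransformer
{
    // Matches tokens such as ${ConnectionString}; this token syntax is an assumption.
    private static readonly Regex Token = new Regex(@"\$\{(\w+)\}");

    // Replaces every known token with its value and records the names of unknown ones.
    public static string Transform(
        string template,
        IDictionary<string, string> values,
        ICollection<string> missing)
    {
        return Token.Replace(template, match =>
        {
            string name = match.Groups[1].Value;
            string value;
            if (values.TryGetValue(name, out value))
                return value;

            missing.Add(name);   // candidate for a build warning or error
            return match.Value;  // leave the unknown token untouched
        });
    }
}
```

A caller would load the values for the selected build target into the dictionary, run the template through `Transform`, and fail the build if the `missing` collection is not empty.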

The following information is copied from codeplex, where you can see more elaborate examples.

Getting started with the Smith build extensions is really easy: simply download the code, build it and copy Smith.BuildExtensions.dll to a directory of your choosing.

Then either create or copy the provided examples of config files and put those in another directory of your choosing.

Then you need to change all project files that you want transformations for.

Add the following line to the project that you want to have configuration transformations in:

<UsingTask TaskName="ConfigTransformTask" 
AssemblyFile="Smith.BuildExtensions.dll" />

But remember to change the AssemblyFile attribute to point to where you put the compiled Smith.BuildExtensions.dll file.

Uncomment the
<Target Name="BeforeBuild">

target and add the following to the target:

<Target Name="BeforeBuild">
   <ConfigTransformTask ConfigBaseDir="..\Configs"
      ConfigTemplate="app.config.base.config"
      Configuration="$(Configuration)" OutputFile=".\App.config" />
</Target>

Where the ConfigBaseDir is where you have placed the app.config and web.config templates and the build specific settings files.

ConfigTemplate is the name of the template to use for the transformation, i.e. if you are doing this in a web project, choose your web.config.base.config file, and choose the app.config.base.config file if it's a normal project or test project.

The OutputFile attribute controls what filename to write the file to, i.e. again for a web project use Web.config and App.config for other projects.

To see a full project file example, head over to the Project file example page

To see how to create the xml configuration files, head over to the Xml examples page.

I hope whoever reads this will find it just as exciting as I do, and will be a happy user of it :)

Amazon DynamoDB released

Today Amazon announced their next-generation NoSQL database, called DynamoDB.

I can't wait to get around playing with it.

I have tried using SimpleDB, and that was a mixed bag of pleasure and pain.

I hope DynamoDB will be more pleasure than pain.

Stay tuned for when I share my experience.

Updates to the memcached client

New updates are available for my memcached client.

I decided to release this update as a proper release in codeplex, since the client contains the features it really needs now.


  • Server monitor that monitors the memcached server nodes, removes them from the cluster if they are dead, and re-adds them as soon as they become available again.
  • MultiGet implemented, so now you can ask for more than one key at a time. Only caveat to that is that the values have to be of the same type.
  • Gets has been implemented so you can get that CAS value to be used for Check and Set operations
  • Set operation has been implemented so you can unconditionally overwrite values in the memcached server.
  • Performance counters have been implemented, so you can see how busy your server is with doing memcached operations and how long it takes.

You can download the new release at:

Having tried out several web development frameworks, and service frameworks while building restful services, I found that none of them were really suited for the job.

So I decided to build a very simple framework that is intended to make REST services and nothing else. It's not an RPC framework, it's meant to be used for REST.

Let me give a very brief overview of why I thought the already established frameworks are not good enough.

MVC is simply too weird for my taste. First of all, it uses more or less "automagic" mapping of the methods in a controller to the verbs being used. I do not like that, I like to be in absolute control. Secondly, you have to return an ActionResult instance from your methods, which is wrong in my opinion and hides the real intent of the methods, i.e. it makes much more sense to return the objects that your method found. I think MVC is meant more for building websites than web services or even REST services.

MVC's async implementation is laughable. Seriously, who thought up the silly way that you have to increment async operations? Why not simply go with the standard BeginXX/EndXX methodology instead of making something really weird? I guess it's because real async is kind of hard to wrap your head around.

I have also tried out both WCF and WCF HTTP, which is the next gen version of WCF that is tailored to build web services over http.

WCF and WCF HTTP are pretty good. First of all, WCF is a service framework, built with services in mind. It's very extensible, although it can be hard to find the exact place to extend if you want to change a particular behaviour. WCF supports asynchronous operations out of the box. You do not have to return a weird result object, but can return whatever you please, an object or void.

WCF did not cut it for me for two simple reasons. First, you cannot build hierarchical REST services with WCF, i.e. you cannot have /addressbook/{addressbookid} served by one class and /addressbook/{addressbookid}/contacts served by another class. All access to the same root must be served by the same service, which requires you to have _ALL_ your methods in one service, which is bad. The other reason is that it's not very easy to exchange the serializer in WCF. In fact it's so hard that I do not think the guys who made the framework ever wanted anyone to exchange the serializers.

WCF HTTP comes with a nice feature where it looks at the Accept header of the request and serves the correct content type, but if you start plugging in your own serializers, i.e. let's say you do not like the DataContractJsonSerializer, like so many people, and inject your own, then you lose that functionality and have to build it yourself as well.

I also briefly looked at the OpenRasta framework, which looks awesome and supports everything you would ever need, except asynchronous services, so you lose some scalability if you use it.

All that being said, I decided to build my own simple framework that tries to do all that I needed, and it's actually very simple to use.

It still lacks a few features. They are nothing you cannot build into your service implementation yourself, and they will come in time.

I have called my framework and you can find it at supports the following features so far:


  • Automatic content type detection and serving of the requested content type
  • Supports asynchronous and synchronous api
  • Non intrusive, you can use any class as a REST service
  • Simple configuration, only add one http handler and configure the routes and you are good to go
  • You can return object instances from your services and the framework will handle serialization
  • Built in support for ETag / If-None-Match for proxy/browser caching capabilities
  • Plugs into an IOC container easily, so you can extend your REST services as you like


Features missing so far:


  • Authentication support natively
  • Logging support


The missing features are something you can easily build into the REST service yourself by using interceptors or even just by checking the auth headers in your methods, but they should be part of the framework, so that kind of boilerplate code does not clutter your business logic.

To show how easy it is to build a REST service with the framework, I have implemented a test REST service that is part of the code on codeplex.

Try it out and let me hear what you think :)

Efficient buffering with BufferManager

When tasked with writing code that does I/O to read data into an application for further processing, it is normal to create a buffer that holds the chunks of data while they are being transferred from the client, the disk or whatever medium the data is coming from.

It is not uncommon to find code similar to the example below.

byte[] buffer = new byte[requestSize];
stream.BeginRead(buffer, 0, requestSize, OnReadComplete, null);


While the code above is okay if your application is not very busy, it might be an issue if you have to process a large number of requests at the same time or in rapid succession.

The reason is that the code above allocates a new buffer for every read. Objects larger than 85,000 bytes are allocated on the large object heap, and if you allocate a lot of differently sized objects there, the large object heap becomes fragmented, which can lead to out-of-memory exceptions.

There are a couple of solutions to prevent this issue.

One is to do your own "memory" management: preallocate, say, 10 large byte arrays, reference those from where you need them, and simply reuse them as needed. This prevents a lot of arrays from being created, and since those 10 arrays stay at the same position on the large object heap, it also prevents the fragmentation.
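
A minimal sketch of such a hand-rolled pool could look like the code below. The `SimpleBufferPool` class and its `Take`/`Return` methods are hypothetical names of my own, and the sketch leaves out things a real pool would need, such as a cap on how many buffers it keeps.

```csharp
using System.Collections.Concurrent;

// Hypothetical fixed-size buffer pool; names and API are illustrative only.
public sealed class SimpleBufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    public SimpleBufferPool(int bufferCount, int bufferSize)
    {
        _bufferSize = bufferSize;

        // Allocate every buffer up front so the same arrays are reused for the
        // lifetime of the application.
        for (int i = 0; i < bufferCount; i++)
            _buffers.Add(new byte[bufferSize]);
    }

    // Hands out a pooled buffer, or allocates a new one if the pool is empty.
    public byte[] Take()
    {
        byte[] buffer;
        return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    // Puts a buffer back into the pool; buffers of the wrong size are dropped.
    public void Return(byte[] buffer)
    {
        if (buffer != null && buffer.Length == _bufferSize)
            _buffers.Add(buffer);
    }
}
```

Callers take a buffer before a read and return it when they are done, ideally in a finally block so a failed read does not leak the buffer out of the pool.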

An easier solution is to use the BufferManager class that was introduced with WCF.

The BufferManager class handles the issue with pre allocating chunks of memory and your application simply requests a chunk of memory and returns it when its done with it.

Rather simple

// Create a buffer manager with a max pool size of 1 MB and a max buffer size of 100 KB
BufferManager bufferManager = BufferManager.CreateBufferManager(1000000, 100000);

// Request a buffer
byte[] buffer = bufferManager.TakeBuffer(100000);

// Work with the buffer
stream.BeginRead(buffer, 0, buffer.Length, OnReadComplete, null);

// Release the buffer once the read has completed, e.g. at the end of OnReadComplete
bufferManager.ReturnBuffer(buffer);


Not only will the buffer manager help mitigate the problem with memory fragmentation, it is also much faster to get a preallocated buffer than to allocate a buffer each time you need one.
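
One thing that is easy to get wrong is forgetting to return the buffer when something throws. The sketch below shows one way to guard against that with try/finally, assuming a synchronous copy between two streams. `PooledCopy.Copy` is a hypothetical helper of my own. Only `TakeBuffer` and `ReturnBuffer` come from the BufferManager class in System.ServiceModel.Channels.

```csharp
using System.IO;
using System.ServiceModel.Channels;

// Hypothetical helper that copies a stream through a single pooled buffer.
public static class PooledCopy
{
    public static long Copy(Stream source, Stream destination,
                            BufferManager bufferManager, int bufferSize)
    {
        byte[] buffer = bufferManager.TakeBuffer(bufferSize);
        try
        {
            long total = 0;
            int read;

            // TakeBuffer returns a buffer of at least the requested size,
            // so use buffer.Length rather than bufferSize.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                destination.Write(buffer, 0, read);
                total += read;
            }
            return total;
        }
        finally
        {
            // Always hand the buffer back, even if reading or writing throws.
            bufferManager.ReturnBuffer(buffer);
        }
    }
}
```

With asynchronous reads the same rule applies, except that the buffer has to be returned from the completion callback instead of a finally block around the call.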

I created a very simple and not very realistic test to show the difference. The first example allocates the buffers as needed.

Stopwatch watch = new Stopwatch();
watch.Start();
for (int x = 0; x < 1000000; x++)
{
    byte[] buffer = new byte[100000];
    for (int y = 0; y < 1000; y++)
        buffer[y] = (byte)(y % 4);
}
watch.Stop();



On my computer this takes 7541 milliseconds on average to run.

The next example uses the buffer manager but is doing the exact same "work".

Stopwatch watch = new Stopwatch();
BufferManager bufferManager = BufferManager.CreateBufferManager(100000, 100000);
watch.Start();
for (int x = 0; x < 1000000; x++)
{
    byte[] buffer = bufferManager.TakeBuffer(100000);
    for (int y = 0; y < 1000; y++)
        buffer[y] = (byte)(y % 4);
}
watch.Stop();


This example only takes 1390 milliseconds on average to run. That's more than 5 times as fast, just for allocating the memory.

In real-world programs you would not only be allocating memory and doing nothing with it, so the relative performance improvement from switching to the BufferManager will not be as great, since the total time spent allocating memory is probably very low, unless you have a lot of garbage collection going on because a lot of objects are being created and destroyed.

But taking both benefits into consideration, I think it's definitely worth using instead of manually allocating buffers to hold your temporary data.

Updates to the asynchronous memcached client

New updates are available for my memcached client.


  • Server monitoring is in place, i.e. a server node is marked as dead if it goes down or if several requests fail for it.
  • Logging framework has been added, so useful log statements can be added.

Coming updates are:

  • Actually using the information added by the server monitor, to remove a node when it is marked as dead and reintroduce it again, if and when it is marked as alive again.
  • Implement Set - I don't know how I could forget this in the first version, but it's very simple to implement with the current implementation.
  • Implement MultiGet - so you can save a few precious roundtrips if you are lucky enough that all your keys end up on the same server node.
  • Implement stats operation - so you can get some useful statistics back from the server.


Anyway, check it out at:

If anyone out there is actually using the client or considering it, please let me know, I would really like some feedback.

Reading structured files into SQL Server Part 2

My last post presented how you can read a file in a structured format into memory for further processing.

This post will focus on how you easily can transport the contents you just imported into SQL server.

If you want to insert data in bulk into SQL Server, the most efficient way of doing that is to use the class System.Data.SqlClient.SqlBulkCopy.

There are two ways you can use SqlBulkCopy, either you give it a DataTable instance with the data represented in the same format and order as the table in the database, or you give it an IDataReader instance, that provides access to the data in the same format as the DataTable would do.

Both methods work just fine, but if you want high performance and efficiency you should not use a DataTable, since it requires you to build up a DataTable object and transform your data into a row format, which is inefficient. The most efficient way is to implement an IDataReader on top of the data that you want to import. Naturally, if you have to implement the IDataReader yourself, the DataTable approach would probably be faster to write, since it is very easy to understand and most people have used a DataTable before. But let's say you want to insert 1 billion rows: then you face the issue that your DataTable simply cannot hold 1 billion rows, so you would have to create several DataTable instances with chunks of data, which would use up a lot of memory anyway and create a lot of objects for the garbage collector to collect.

By using an IDataReader you only have to provide one row at a time to the SqlBulkCopy class, and you can easily reuse your internal row representation for each row. This makes it very efficient, since you create fewer objects and hold less data in memory at the same time. Furthermore, creating fewer objects causes less garbage collection, which is good, since the application can pause while the garbage collector runs.

Now less words and more code, I have created a few classes that help with the IDataReader implementation that I have made.


  • FileDataColumn - A class that is used to describe the format in the record you try to load into the IDataReader.
  • FileDataRecord - An IDataRecord implementation with the possibility to also set the values of the record, not only read data from it.
  • FileDataReader - An IDataReader implementation that uses the FileRecordReader from my last post to provide forward only access to each record as an IDataRecord.



The FileDataColumn class only contains two properties, ColumnName and ColumnType. It is fairly obvious what they are used for, so I will not go into any detail on that class.
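
To illustrate how such a column description can drive the parsing of a record, here is a minimal sketch. It is not the code from the attached project: `ColumnSpec` is a hypothetical stand-in for FileDataColumn, `RecordParser` is a hypothetical helper, and the sketch assumes that fields never contain the separator character.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in with the same two properties as described above.
public sealed class ColumnSpec
{
    public string ColumnName { get; set; }
    public Type ColumnType { get; set; }
}

// Hypothetical helper that turns one raw record into typed values.
public static class RecordParser
{
    // Splits the record into fields and converts each field to its column's type.
    public static object[] Parse(string record, char fieldSeparator, ColumnSpec[] columns)
    {
        string[] fields = record.Split(fieldSeparator);
        if (fields.Length != columns.Length)
            throw new FormatException(
                "Expected " + columns.Length + " fields but found " + fields.Length + ".");

        var values = new object[columns.Length];
        for (int i = 0; i < columns.Length; i++)
        {
            // The invariant culture keeps number parsing independent of the machine's locale.
            values[i] = Convert.ChangeType(
                fields[i], columns[i].ColumnType, CultureInfo.InvariantCulture);
        }
        return values;
    }
}
```

The columns must be listed in the same order as the fields in the file, since the position in the array is the only link between a field and its type.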

The FileDataReader takes a few arguments in its constructor that will enable it to read the data and provide a nice interface to it.


/// <summary>
/// Initializes a new instance of the <see cref="FileDataReader"/> class.
/// </summary>
/// <param name="fileStream">The file stream.</param>
/// <param name="columns">The columns describing the format of the stream for a single record.</param>
/// <param name="recordSeparator">The record separator.</param>
/// <param name="fieldSeparator">The field separator.</param>
/// <param name="fileEncoding">The file encoding.</param>
/// <param name="recordManipulator">The record manipulator.</param>
public FileDataReader(Stream fileStream, 
FileDataColumn[] columns, 
char recordSeparator, 
char fieldSeparator, 
Encoding fileEncoding,
Action<FileDataRecord> recordManipulator)


First argument is the stream where the data is located. In real world scenarios this would be a FileStream variant that would point to the file you want to read - this filestream will be passed onto the FileRecordReader instance that the constructor creates.

Second argument is an array of FileDataColumn objects that describes the record format of the file. They must be in the same order as the fields in the file.

Third argument is the record separator character, i.e. the character that separates the records from each other in the file.

Fourth argument is the field separator character, i.e. the character that separates the fields in the file.

Fifth argument is the encoding of the file, which is important in particular if you want to read text.

Last argument is an action that will be called before each call to Read returns, which gives you an opportunity to modify the data before it is passed on to whatever reads from the reader.

You use the FileDataReader as you would use any other IDataReader, by invoking the Read() Method that will return a bool indicating whether or not the reader was positioned at the next record or not.



IDataReader dataReader = new FileDataReader(s, cols, '\n', ',', Encoding.Unicode);

while (dataReader.Read())
{
    string fieldValue = (string)dataReader["field"];
    int fieldValue2 = (int)dataReader[2];
}

And so forth - the beauty of it is that if you do not want to do any processing you can just give the SqlBulkCopy the instance of the FileDataReader and you don't have to do any more work what so ever.

If you need to manipulate each record, you simply provide an Action to the FileDataReader i.e.

Stream s = new MemoryStream(1000);
for (int x = 0; x < 10; x++)
    AddRecordToStream(s, string.Format("{0}\n", (x * 10)));
s.Position = 0;
FileDataColumn[] cols = new[]
{
    new FileDataColumn { ColumnName = "First", ColumnType = typeof(int) }
};

// The last argument doubles the value of the first field in every record
IDataReader dataReader = new FileDataReader(
    s, cols, '\n', ',', Encoding.Unicode,
    record =>
    {
        int currentValue = record.GetInt32(0);
        record.SetValue(0, currentValue * 2);
    });

for (int x = 0; x < 10; x++)
{
    dataReader.Read();
    Assert.That(dataReader[0], Is.EqualTo(x * 10 * 2), x.ToString());
}


Nice and easy if you ask me :) - naturally you could easily extend and improve my FileDataReader implementation, but this gives you a hint on how you can efficiently read a file into SQL Server if you need to.

To use this reader together with SqlBulkCopy you simply create an instance of the FileDataReader and use it like below:


using (SqlBulkCopy bulkCopy =
                new SqlBulkCopy(destinationConnection))
    bulkCopy.DestinationTableName =

    catch (Exception ex)
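
As a minimal sketch of that usage, assuming nothing beyond the standard SqlBulkCopy API: `BulkLoader.Load` is a hypothetical helper of my own, and the connection string, the destination table name and the batch size are placeholders you would supply yourself. Any IDataReader works here, including a FileDataReader.

```csharp
using System.Data;
using System.Data.SqlClient;

// Hypothetical helper that streams all rows from a reader into one table.
public static class BulkLoader
{
    public static void Load(string connectionString, string destinationTable, IDataReader reader)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var bulkCopy = new SqlBulkCopy(connection))
        {
            connection.Open();

            bulkCopy.DestinationTableName = destinationTable;
            bulkCopy.BatchSize = 10000; // rows sent per batch; an arbitrary illustrative value

            // Pulls rows from the reader one at a time until Read() returns false.
            bulkCopy.WriteToServer(reader);
        }
    }
}
```

Exceptions are deliberately not caught here, so that a failed import surfaces to the caller instead of being swallowed.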


I have attached the entire source code project for both this post and the previous one, including integration tests that will show how to use the code.

I hope you enjoy using it, I certainly enjoyed writing the code.

If you have any questions, post a comment or leave feedback.


Reading structured files into SQL Server Part 1

From time to time we have all probably been tasked with getting a structured file into SQL Server.

It could be a comma-separated file, or it could use some other delimiter. It does not really matter. What matters is that there are several ways of getting that data into SQL Server, where some are fast and efficient and others are slow and sometimes even impossible if you do it wrong.

The obvious way of importing a structured file into SQL Server is to use either BCP or SQL Server's built-in BULK INSERT.


FROM 'c:\commaseparatedfilename.csv'


BCP and the built-in method are fine when you want to do a one-time import, and if you like to stretch it a bit you could even do periodic imports using a maintenance task that imports a specific file from a location into SQL Server every day at a certain time.

But what if you need to do some processing of the file? Then you are pretty much stuck with writing a program that reads the file and writes the modified records into sql server.

That might be a cumbersome task, and what if your file is several gigabytes in size? Then you cannot simply read in the entire file, since your program might run out of memory. So what you should do, if possible, is read one record from the file at a time, process the record and pass it on to SQL Server.

I have created a few classes to help with that, which I will present in this blog post and the ones to come.

The tasks you need to do to get that file into SQL Server are probably something like:


  1. Read the records out from file, one at a time, as efficiently as possible using as little memory as possible
  2. Parse each record into its different columns, resulting in a strongly typed object that can be passed on to SQL Server easily.
  3. Optionally process each record and its values before it is passed on to SQL Server for storing.


I will present a nice solution to task #1 in this first blog post, and will present a solution to #2 in the next blog post.

For task #1 I have created a nice little class that I call FileRecordReader, which basically has a single method called ReadNextRecord.


The method ReadNextRecord will read the next record and return that as a string and advance its internal positions to the location of the next record in the file.


/// <summary>
/// Reads the next record from the stream.
/// </summary>
/// <returns>The next record from the stream or null if no more records exist.</returns>
public string ReadNextRecord()


The FileRecordReader class takes a few arguments in its constructor that will help it read the file and understand where a record starts and stops.


/// <summary>
/// Initializes a new instance of the <see cref="FileRecordReader"/> class.
/// </summary>
/// <param name="fileStream">The file stream.</param>
/// <param name="recordSeparator">The record separator.</param>
/// <param name="fileEncoding">The file encoding.</param>
public FileRecordReader(Stream fileStream, char recordSeparator, Encoding fileEncoding)


First argument is the stream where the reader should read its data from, which in real life usages should be a FileStream instance.

Second argument is a char that is used to separate the records from each other. The normal use case would be a newline character \n, but the class supports any character you would like to use, in case your records contain line breaks that you would like to retain in the imported data.

Last parameter is the encoding of the file. This is also very important, since a UTF-8 or UTF-16 encoded file that is read using the default Windows encoding will be parsed incorrectly and will not look pretty.

If you look at the source code attached you might see that it has a similar way of working as the built in StreamReader class and the method ReadLine - but if you have a different record separator you cannot use StreamReader but have to parse the file yourself.
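
For readers who do not have the attachment at hand, the general technique can be sketched as below. This is not the attached FileRecordReader: `RecordSplitter.ReadRecords` is a hypothetical helper of my own that exposes the records as a lazy sequence instead of a ReadNextRecord method, and it does not try to be as efficient as a hand-written buffer parser.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical stand-in for illustration, not the attached FileRecordReader.
public static class RecordSplitter
{
    // Lazily yields one record at a time, so only the current record is held in memory.
    public static IEnumerable<string> ReadRecords(
        Stream stream, char recordSeparator, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            var current = new StringBuilder();
            int next;
            while ((next = reader.Read()) != -1)
            {
                char c = (char)next;
                if (c == recordSeparator)
                {
                    // A separator ends the current record.
                    yield return current.ToString();
                    current.Clear();
                }
                else
                {
                    current.Append(c);
                }
            }

            // Return a final record that is not terminated by a separator.
            if (current.Length > 0)
                yield return current.ToString();
        }
    }
}
```

Because the method is an iterator, nothing is read from the stream until the caller starts enumerating, and line breaks inside a record are preserved when the separator is some other character.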

I have attached the source code to the FileRecordReader class and also a simple test that tests that the class is working.

FileRecordReader.cs (5.33 kb)

FileRecordReaderTests.cs (2.86 kb)


Stay tuned for the next post where I will describe how you can use the above FileRecordReader to present a nice interface that makes it easy to get those records into SQL Server.


Updates to the asynchronous memcached client

This evening I updated the asynchronous memcached client that I am building.

I implemented basic performance counters, so the client will populate performance counters for the following items:


  • Average wait time for a free client socket
  • Total cache operations per second
  • Adds per second
  • Appends per second
  • CAS per second
  • Deletes per second
  • Prepends per second
  • Replaces per second
  • Sets per second
  • Errors per second


The performance counters also gave me a nice way to assess the performance of the client. With a soak test program running full throttle, only sleeping one ms between iterations and using 50 threads and 25 sockets, the code is capable of doing almost 20,000 add operations per second against one memcached server.

That's pretty awesome if you ask me :)

Check it out at - and please do add comments (if anyone is reading this at all)

Asynchronous memcached client

I have been working with distributed caching for about 4 years now, using memcached as the only server.

I have been trying out different memcached clients, and some have been good, others bad.

They have all had the same problem: they have been implemented synchronously, i.e. they have been wasting a lot of threads on simple waits.

I have started a project to create a fully asynchronous memcached client in .NET.

Check out:

It's not production code yet, but it's a fully working client for gets/sets. It just needs some additional features, and then I will release a version.