Cumps Consulting

NSurvey

Posted on 2005.02.20 | No comments | Microsoft Internship, NSurvey, Technology

This post has been imported from the old blog and has not yet been converted to the new syntax yet.

For one of the projects I had to do, I had to create an online survey application, which would be used to gather feedback from Microsoft events. Up until then, feedback was collected by handing out a form and entering the data manually.

As I was given free choice on how to solve this problem I suggested using an existing open-source framework and extending it to meet the requirements. This suggestion was quickly approved because on one side it meant commitment from Microsoft towards open-source and on the other hand it prevented re-inventing the wheel. The project used for this solution is called NSurvey. This provides a survey framework, making it very easy to setup surveys, add questions, add users, do mailings, implement security on a survey-based level, perform statistical operations on the answers and add new functionality by extending on existing classes.

NSurvey is an ASP.NET application, written in C#, which uses a SQL Server back-end, using stored procedures, and various other layers. The installation of NSurvey went very smoothly because of an msi file, placing all files in their correct location.

I started by testing the application and learning the big structure of how it worked. During this small test round, I began thinking on how the final solution would look.

Reflector

Posted on 2005.02.15 | No comments | Microsoft Internship, Technology

This post has been imported from the old blog and has not yet been converted to the new syntax yet.

One of the tools I had in my toolbox was Reflector. This tool is written by Lutz Roeder and allows you to examine a .NET assembly. Through the use of reflection it can display all namespaces, classes, methods, properties, events and fields in the dll.

It is possible to view the code in IL, C#, VB.NET and Delphi. Some of the useful features are the Call and Callee Graph.

The Call Graph shows you which items are used by a given method, while the Callee Graph displays all the methods that call a given method.

Forgot something for blog readers, the url where to get it: http://www.aisto.com/roeder/dotnet/

Boxing/Unboxing In .NET

Posted on 2005.02.14 | Comments (7) | C#, Microsoft Internship

This post has been imported from the old blog and has not yet been converted to the new syntax yet.

In the .NET Framework there is a feature called boxing, which goes hand in hand with unboxing. Boxing is an implicit conversion of a value-type to the object type. While this is a very nice feature when programming normal applications, this overhead becomes a big performance hit when working with algorithms that need to do lots of operations, such as path finding.

When you create a value-type, itâ€™s originally placed on the stack. If you box this value-type, the required memory is allocated on the heap, the value gets copied and an object is placed on the stack, pointing to the boxed value on the heap.

Boxing an int

Unboxing the int again

In a pathfinder I created long ago, in the .NET 1.1 era, a list of points had to be maintained. While this was possible using an ArrayList, it proved faster to write a typed list. The main reason behind this was because the ArrayListâ€™s method signatures all work with objects, causing implicit boxing when using them. Retrieving items from an ArrayList also required unboxing, because that had to be cast back into their Point type.

I wrote a small application to demonstrate the boxing and unboxing taking place, and the performance impact. The test data were 10 million random Point values.

[csharp]// Adding to an ArrayList
for (int i = 0; i < itemsToGenerate; i++) {
arrayList.Add(rndCosts[i]);
}[/csharp]

When we take a look at the IL code this piece generates, we would see the following:

[code]ldobj [System.Drawing]System.Drawing.Point
box [System.Drawing]System.Drawing.Point
callvirt instance int32 [mscorlib]System.Collections.ArrayList::Add(object)[/code]

On the second line, you can see it uses the box operation to box our value-type Point (stored in the typed Point array rndCosts) before calling the Add method.

The solution to this is using generics, available from .NET 2.0 onwards, or writing your own typed list object. As .NET 1.1 had to be used, I chose the second solution. To do this, I used Reflector to look at the ArrayList code, and use that code to create a new list, replacing all instances of object with our required type, Point in this example.

Now we use the same piece of test code with our PointList.
[csharp]// Adding to a typed PointList
for (int i = 0; i < itemsToGenerate; i++) {
pointList.Add(rndCosts[i]);
}[/csharp]

If we look at the IL this piece generates, we notice the following:

[code]ldobj [System.Drawing]System.Drawing.Point
callvirt instance int32 [PointList]CumpsD.Collections.PointList::Add(
valuetype [System.Drawing]System.Drawing.Point)[/code]

The Framework does not use the box operation anymore, effectively getting rid of the overhead we had with our first solution. When we take a look at the test results, we can clearly see the difference between the first solution and the second.

[code]ArrayList: added 10000000 items.
00:00:03.1845792 seconds
PointList: added 10000000 items.
00:00:00.2804032 seconds[/code]

By using the strong typed PointList it was possible to get just a bit more performance out of the code, making the project operate better in the long run. Just for fun, I revisited this test using .NET 3.5, with generics and created a List to store the items. I ended up having similar performance (00:00:00.2915599 seconds) as the typed list I created in .NET 1.1, pretty good considering I wrote it two years ago ;)

The main conclusion here is to avoid any non-typed collection when performance is an issue. Some browny points for our Bits & Bytes Religion when it comes to the happy feeling you get when working with generics :)

Imagine Cup

Posted on 2005.02.13 | No comments | Microsoft Internship

This post has been imported from the old blog and has not yet been converted to the new syntax yet.

Important note (to not get confused):
The text below (and the future ones) is written in the past tense, because they are text that have to go into a book for school, which you have to write after your internship, giving info about what you did. But I'm writing it piece by piece, because otherwise it will be too much to remember, and a lot less detailed :)

For the second year in row, Microsoft organized the Imagine Cup. This is an international contest with various categories. As a part of my internship, I had to compete in the Visual Gaming and the Information Technology category.

The IT category was about diving into solving real life IT problems. With questions about networks, databases and various servers the content was really diversified.

You get 30 minutes to solve 30 questions, scoring 3 for each correct answer, 0 for a blank answer and -1 for a wrong answer. The first 5 people of a country advanced to the next category, which was the first goal of my internship.

After having spent half an hour taking the quiz, I had to wait a day to get the results, to prevent abuse. My score ended up being a 66/90, placing me at a shared first place in the Belgian competition. My first goal was reached.

The Visual Gaming competition was a coding challenge. In this competition you were given an SDK, which included a 2D-viewer, 3D-viewer, documentation and the required assemblies.

In the VG competition you had to write code for robots in a small game, their brains, or also called, AI. The main things I learned thanks to this competition were algorithm-knowledge, performance issues and logic thinking.

Algorithm knowledge was useful to find optimal actions for the robots, for example, the A* algorithm explained above, helped to find the shortest path. Other algorithms such as Traveling Salesman Problem also had to be solved and implemented to gain a strategic advantage in this game.

Performance was a very important aspect in the VG competition. Because your code had a limited time-window it could run in, it had to be very fast, making sure it stayed inside that window. This was the reason why I implemented a binary heap in the pathfinder, and why I made a lot of performance optimizations, such as preventing boxing and unboxing, storing certain 2-dimensional arrays in a 1-dimensional and cutting back on object initializations.

But knowing how the algorithms work and how to tweak for performance alone didnâ€™t do the trick. The difficulty lies in making it all work together and operating as a big unit, working its way to a victory. Thatâ€™s where logic thinking came in, to determine what tactics to use, and which algorithms to use.

After having played with this an afternoon I managed to get 1325 points, which was enough to get to Round 2, and to achieve another internship goal. My personal next goal was to try to score as good as possible in the second round.

(Which I will talk about when the second rounds is on its way :p)

Binary Heaps

Posted on 2005.02.12 | No comments | Microsoft Internship

This post has been imported from the old blog and has not yet been converted to the new syntax yet.

A pathfinder has to be very fast, but when we take a look at the performance we notice that working with the open-list is the bottleneck.

There are several possible data structures to store the open-list. We could use an ArrayList to store the values and keep it sorted using an IComparable interface. With this solution we end up with too much overhead keeping the entire list sorted. After all, the only thing our pathfinder is interested in, is the node with the lowest F-score, it doesnâ€™t care about the other nodes.

A better solution is using a binary heap. In a binary heap, each item has two children with a value higher or equal to itself. Which means in the end the lowest item is at the top of the heap, easily accessible.

One of the nice things of a binary heap is the fact that it can be stored in a 1-dimensional array, making sorting of the heap a very quick operation.

The top of the heap is always stored at index 1. We donâ€™t use the 0-index when working with zero-based arrays.

The children of any given item are always stored at the itemâ€™s location * 2 and the itemâ€™s location * 2 + 1. For example, in the image given above, the item with value 20 is stored at index 3 and its two children can be found at index 6 (3 * 2) and index 7 (3 * 2 + 1).

Adding an item to a binary heap can be achieved by adding the new item at the end of the array and then letting the new item bubble its way up.

This is achieved by comparing the item with its parent, swapping them when the item is smaller then its parent and repeating this until the parent is smaller.

[csharp]
public void Add(Int32 fCost) {
this.binaryHeap[this.numberOfItems] = fCost;

Int32 bubbleIndex = this.numberOfItems;
while (bubbleIndex != 1) {
Int32 parentIndex = bubbleIndex / 2;
if (this.binaryHeap[bubbleIndex] <= this.binaryHeap[parentIndex]) {
Int32 tmpValue = this.binaryHeap[parentIndex];
this.binaryHeap[parentIndex] = this.binaryHeap[bubbleIndex];
this.binaryHeap[bubbleIndex] = tmpValue;
bubbleIndex = parentIndex;
} else {
break;
}
}
this.numberOfItems++;
} /* Add */
[/csharp]

To remove an item from a binary heap, we simply take the item at index 1. But now we have to repair our heap, because there is a gap at the top. To fix this we take the last item and place it at the top, after which we let it sink downwards. This is done by comparing the value with its two children, replacing it with the smallest child and repeating this until the parent is smaller than both children.

[csharp]
public BinaryHeapItem Remove() {
this.numberOfItems--;
Int32 returnItem = this.binaryHeap[1];

this.binaryHeap[1] = this.binaryHeap[this.numberOfItems];

Int32 swapItem = 1, parent = 1;
do {
parent = swapItem;
if ((2 * parent + 1) <= this.numberOfItems) {
// Both children exist
if (this.binaryHeap[parent] >= this.binaryHeap[2 * parent]) {
swapItem = 2 * parent;
}
if (this.binaryHeap[swapItem] >= this.binaryHeap[2 * parent + 1]) {
swapItem = 2 * parent + 1;
}
} else if ((2 * parent) <= this.numberOfItems) {
// Only one child exists
if (this.binaryHeap[parent] >= this.binaryHeap[2 * parent]) {
swapItem = 2 * parent;
}
}
// One if the parent's children are smaller or equal, swap them
if (parent != swapItem) {
Int32 tmpIndex = this.binaryHeap[parent];
this.binaryHeap[parent] = this.binaryHeap[swapItem];
this.binaryHeap[swapItem] = tmpIndex;
}
} while (parent != swapItem);
return returnItem;
} /* Remove */
[/csharp]

A small comparison between an ArrayList and this binary heap implementation gives the following results:

[code]
Binary Heap: added 4000 items.
Time needed: 00:00:00
Lowest F-score: 1
Sorted ArrayList: added 4000 items.
Time needed: 00:00:07.2968750
Lowest F-score: 1

Binary Heap: added 10000 items.
Time needed: 00:00:00.0156250
Lowest F-score: 1
Sorted ArrayList: added 10000 items.
Time needed: 00:00:56.1250000
Lowest F-score: 1
[/code]

Inspiration and some images used from Patrick Lester.

«« Older Posts

Newer Posts »»