sloppycode.net
.NET Performance + architecture
A set of tips and musings on performance enhancing code and architectural decisions in C#.


Your base class: Abstract or Interface?
This could be seen as a bit of a beginner's topic, but one thing that tutorials and books often overlook is deciding when to use each one. The fashion for using interfaces for everything often rejects abstract classes outright. The BCL gives some good pointers on when to use which, and points you in the direction of using both where needed. There are examples of this inside System.Collections.Generic, such as IList<T> and List<T>.

The best advice I've read from various books is to think about what type of relationship you want the derived class to have: should it have to perform all tasks the interface wants it to? Look at it from the perspective of the developer who will be creating the derived class. If the class should always implement every method/property/event an interface sets out then an interface is the right choice. If you want to give them some flexibility to choose which methods are implemented then make it an abstract class. This also ensures consistency: enforcing an interface where a lot of the methods don't make sense to anyone except the original developer of the interface is a waste. Good examples of this can be found in the IE API and the Office toolbars.

A more detailed discussion can be found in the must-have "CLR via C#" by Jeffrey Richter, which I'll mention a few times throughout this article (I'm not pretending my advice has equal standing to his; I'm just regurgitating good advice or my own research rather than having worked for Microsoft with the .NET team).

On top of these design choices are the restrictions C# imposes:

- You can inherit from multiple interfaces but only one abstract class.
- A class can implement any number of interfaces; an interface can be implemented by any number of classes.
- Neither can be instantiated.
- Abstract classes shouldn't be used to restrict instantiation. Use a private or internal constructor for this.
- Abstract classes can have fields.
- Abstract classes can implement default functionality.
- Abstract classes normally have at least one abstract method, but can have concrete methods/properties etc (as above)
- Interfaces are restricted to methods, events, properties and indexers only.
- Interfaces allow the same method to be implemented in different ways (something Java lacks) using explicit implementation.
- Explicit interface implementations can't carry an access modifier and can only be called through a reference of the interface type, while implicit implementations are public. An implicit implementation will provide the implementation for all interfaces that declare the member if no explicit one exists for a given interface.
- Value types can't inherit from other types but can implement interfaces (for example System.Boolean implements IComparable and IConvertible, among others).
- If you implement an interface's method explicitly in your value type, the value will need to be boxed to a reference type in order to call it, for example when using Int32's implementation of IConvertible.
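
As a rough illustration of the explicit implementation points above, here is a minimal sketch. The IPrinter, IScanner and Copier names are hypothetical and exist only for this example; the Int32 call relies on Int32 implementing IConvertible.ToInt64 explicitly.

```csharp
using System;

// Hypothetical interfaces that happen to share a method name.
interface IPrinter { string Describe(); }
interface IScanner { string Describe(); }

class Copier : IPrinter, IScanner
{
    // Explicit implementations: no access modifier is allowed, and each is
    // reachable only through a reference of the matching interface type.
    string IPrinter.Describe() { return "printer"; }
    string IScanner.Describe() { return "scanner"; }

    // An ordinary public method, unrelated to either interface.
    public string Describe() { return "copier"; }
}

static class ExplicitDemo
{
    static void Main()
    {
        Copier copier = new Copier();
        Console.WriteLine(copier.Describe());               // copier
        Console.WriteLine(((IPrinter)copier).Describe());   // printer
        Console.WriteLine(((IScanner)copier).Describe());   // scanner

        // Int32 implements IConvertible.ToInt64 explicitly, so the int has to
        // be cast to the interface (and is boxed) before the call can be made.
        int n = 42;
        long viaInterface = ((IConvertible)n).ToInt64(null);
        Console.WriteLine(viaInterface);                    // 42
    }
}
```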

Abstract methods vs Virtual methods
What IL code is emitted when you have an abstract method, and a virtual method? For the following C#

the equivalent IL is:

and

As you can see, MyConcrete.Run() actually contains opcodes while MyAbstract.Run(), as you'd expect, contains nothing. Abstract methods are virtual though, which probably isn't a huge surprise either. Both of the above use the newslot keyword, which indicates that the method always gets a new slot in the v-table rather than overriding an existing one (as they are base classes there is nothing to override).
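
For reference, a minimal sketch of the two kinds of method, reusing the MyAbstract and MyConcrete names from the text. The string return type, the method bodies and the two derived classes are assumptions made purely for illustration.

```csharp
using System;

abstract class MyAbstract
{
    // No body: every non-abstract derived class must override this.
    public abstract string Run();
}

class MyConcrete
{
    // Has a body: derived classes may override it, but don't have to.
    public virtual string Run() { return "MyConcrete.Run"; }
}

// Hypothetical derived classes, added only to show the override side.
class FromAbstract : MyAbstract
{
    public override string Run() { return "FromAbstract.Run"; }
}

class FromConcrete : MyConcrete
{
    // Inherits MyConcrete.Run unchanged.
}

static class VirtualDemo
{
    static void Main()
    {
        MyAbstract a = new FromAbstract();
        MyConcrete c = new FromConcrete();
        Console.WriteLine(a.Run()); // FromAbstract.Run
        Console.WriteLine(c.Run()); // MyConcrete.Run
    }
}
```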

IComparable, IComparer, Equals, IEquatable<T>, IEqualityComparer
All of these are used in conjunction with sorting and equality checking.

IComparer
This is used for custom sorting of objects. Prior to .NET 3.5 it was used primarily for Array.Sort but now finds itself used for sorting with LINQ. It has several default implementations including (as the MSDN page says) Comparer, CaseInsensitiveComparer. When implementing the interface, there is one method to fill:
int Compare(Object x, Object y)

Returns:
x < y		less than zero (conventionally -1)
x == y		zero
x > y		greater than zero (conventionally 1)

IComparer<T> allows you to generalise the comparison by type.

IComparable
IComparable gives you one method to implement:
  
int CompareTo(Object obj) 

The same return value system applies as IComparer. The difference between this and IComparer is that IComparable is implemented on the class that contains your field values, while IComparer-derived classes are used for custom sorting. Take, for example, a User class with Name and Age properties.

The CompareTo method would be the default way of ordering your User class, perhaps comparing by Name. If you then decided that you wanted to sort a list of User objects by another property, say Age, you could write a class that implemented IComparer instead of altering CompareTo inside User. This would perform the custom sorting.

As with IComparer you can generalise the comparison by implementing IComparable<T>.
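
The split between a default ordering and a custom one might look something like the sketch below. The User fields and the UserAgeComparer class name are assumptions for illustration; the generic forms IComparable<User> and IComparer<User> are used to avoid casts.

```csharp
using System;
using System.Collections.Generic;

class User : IComparable<User>
{
    public string Name;
    public int Age;

    public User(string name, int age) { Name = name; Age = age; }

    // Default ordering: by Name.
    public int CompareTo(User other)
    {
        if (other == null) return 1; // by convention an instance sorts after null
        return string.Compare(Name, other.Name, StringComparison.Ordinal);
    }
}

// Custom ordering kept outside User: by Age (hypothetical class name).
class UserAgeComparer : IComparer<User>
{
    public int Compare(User x, User y)
    {
        if (object.ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;
        return x.Age.CompareTo(y.Age);
    }
}

static class SortDemo
{
    static void Main()
    {
        List<User> users = new List<User>
        {
            new User("Carol", 25),
            new User("Alice", 40),
            new User("Bob", 31)
        };

        users.Sort();                       // uses User.CompareTo
        Console.WriteLine(users[0].Name);   // Alice

        users.Sort(new UserAgeComparer());  // uses the custom comparer
        Console.WriteLine(users[0].Name);   // Carol
    }
}
```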

Equals
Besides object comparison, Equals is also used in a few common collection methods: IndexOf() and Contains() both use the Equals() method for equality comparisons. This extends even further with .NET 3.5 and the inclusion of LINQ, where Equals is used frequently for comparison, which leads to...

IEquatable<T>
This interface was added in .NET 2.0. Its primary use is with value types to avoid the use of ValueType.Equals() (which all structs automatically use if Equals isn't overridden). ValueType.Equals uses reflection to iterate through every field in the struct, checking for equality. If they are all equal to those of the object passed then true is returned (see the source further on in the article). Implementing IEquatable<T> gives a strongly typed comparison between your object and a T, which is most likely to be the same type, so no boxing is needed. This is performed frequently with the new methods brought in alongside LINQ in 3.5. Overriding Equals() from ValueType will do a check for any object type, and it's recommended that this is overridden in any custom value type alongside implementing IEquatable<T>.
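
A minimal sketch of a struct that follows this recommendation. The Point type and its hash formula are hypothetical, chosen only to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type used only for illustration.
struct Point : IEquatable<Point>
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y) { X = x; Y = y; }

    // Strongly typed comparison: no boxing and no reflection.
    public bool Equals(Point other)
    {
        return X == other.X && Y == other.Y;
    }

    // Still override the object version so both paths agree.
    public override bool Equals(object obj)
    {
        return obj is Point && Equals((Point)obj);
    }

    // Equal values must produce equal hash codes.
    public override int GetHashCode()
    {
        return (X * 397) ^ Y;
    }
}

static class EquatableDemo
{
    static void Main()
    {
        List<Point> points = new List<Point> { new Point(1, 2), new Point(3, 4) };

        // List<T>.Contains and IndexOf pick up IEquatable<T>.Equals.
        Console.WriteLine(points.Contains(new Point(3, 4)));  // True
        Console.WriteLine(points.IndexOf(new Point(9, 9)));   // -1
    }
}
```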

IEqualityComparer<T>
This interface, also added in 2.0, allows you to pass custom equality checking to a Dictionary<TKey, TValue>; its non-generic counterpart IEqualityComparer does the same for Hashtable and NameValueCollection. You pass in your custom comparison implementation much like you do with IComparer, and the collection will use it for equality checks. The comparer is supplied through the constructor.
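
As a sketch of how this could be used with a Dictionary, assuming a hypothetical User class whose identity for lookup purposes is its Name only:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type for illustration.
class User
{
    public string Name;
    public int Age;
    public User(string name, int age) { Name = name; Age = age; }
}

// Treats two users as equal when their names match, ignoring Age.
class UserNameEqualityComparer : IEqualityComparer<User>
{
    public bool Equals(User x, User y)
    {
        if (object.ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.Ordinal);
    }

    // Must agree with Equals: equal users produce equal hash codes.
    public int GetHashCode(User user)
    {
        return (user == null || user.Name == null) ? 0 : user.Name.GetHashCode();
    }
}

static class EqualityComparerDemo
{
    static void Main()
    {
        // The comparer is handed to the collection through its constructor.
        Dictionary<User, string> roles =
            new Dictionary<User, string>(new UserNameEqualityComparer());

        roles[new User("alice", 30)] = "admin";
        roles[new User("alice", 99)] = "editor";   // same key, value replaced

        Console.WriteLine(roles.Count);                              // 1
        Console.WriteLine(roles.ContainsKey(new User("alice", 1)));  // True
    }
}
```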

Exception handling
The two often repeated phrases about exceptions are that they should be used for "exceptional behaviour" and sparingly; they are a performance hit on your application and aren't a replacement for logging. These are of course good recommendations, however one point that is often neglected is guidance on when exceptions should be thrown, and when they are just a performance hit.

The best discussion I've read on the topic is once again inside "CLR via C#" by Jeffrey Richter, where he describes more simply how an exception should be thrown when your method can't fulfil the contract it describes. I find this idiom a lot easier to work with than the "exceptional behaviour" metaphor, and his standpoint is that exception handling, while a performance drain, is a lot more beneficial than having code that performs erratically but milliseconds faster.

This obviously has to be used with common sense: always throwing an exception when something unexpected happens, such as empty data in a database, isn't what the advice is about. However, simple templated exception handling, like parameter checking on each method for ArgumentNullExceptions, throwing exceptions when an object's property you're using is null, or when database loads/saves fail, is far more intuitive (in my view) than having to step through code or search through logs, especially on production environments.
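
A small sketch of the kind of templated parameter checking described above. The Greeter class and its contract are hypothetical; the point is only that the method throws as soon as it knows it can't do what its signature promises.

```csharp
using System;

static class Greeter
{
    // Contract: returns a greeting for a non-null, non-empty name.
    public static string Greet(string name)
    {
        if (name == null)
            throw new ArgumentNullException("name");
        if (name.Length == 0)
            throw new ArgumentException("Name must not be empty.", "name");

        return "Hello, " + name;
    }

    static void Main()
    {
        Console.WriteLine(Greet("Ada"));     // Hello, Ada

        try
        {
            Greet(null);
        }
        catch (ArgumentNullException e)
        {
            // The exception names the offending parameter, which is far
            // quicker to diagnose than a NullReferenceException further down.
            Console.WriteLine(e.ParamName);  // name
        }
    }
}
```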

Jeffrey Richter covers it in a lot more detail in his book, along with talking about the Windows legacy of HRESULT and COM error passing.

Custom Exceptions: ApplicationException vs Exception
(This discussion is also from "CLR via C#" by Jeffrey Richter)
When creating a custom exception, which class should you derive from: ApplicationException or Exception? MSDN and other sources say inherit from ApplicationException, one of the well known C#/.NET standard practices along with the "exceptional behaviour" metaphor mentioned above. The idea behind it is all CLR exceptions inherit from SystemException, and your custom exceptions inherit from ApplicationException making generic application exception handling a case of:
 
catch ( ApplicationException e) 
{ 
     // ... 
}

Microsoft developers working on the .NET framework have voiced opposite views on deriving from ApplicationException.

The problem with the guideline of using ApplicationException is that there are actually classes in the framework that inherit from ApplicationException! Not many, but they are still present. More importantly, using this guideline stops you from inheriting functionality from existing SystemException classes, forcing you to reinvent functionality that already exists. Take for example ArgumentException. If you wanted to create your own version so that callers know they passed a bad argument to your code, you would be reinventing this class by basing it on ApplicationException, when really the benefits are few. Deriving from ArgumentException instead would break on systems that catch all ApplicationExceptions, but how many of these exist? I'm happy inheriting from the relevant Exception class in the framework and ignoring the ApplicationException guideline, as both Microsoft engineers and Jeffrey Richter's own inside knowledge of Microsoft point out that Microsoft want to remove the two-pronged Exception structure but can't due to legacy code.
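
To make the ArgumentException case concrete, here is a minimal sketch. InvalidUserNameException and UserService are hypothetical names; the only framework behaviour relied on is that ArgumentException exposes a (message, paramName) constructor and a ParamName property.

```csharp
using System;

// Hypothetical custom exception: derives from the closest framework type
// rather than from ApplicationException, so it inherits ParamName for free.
class InvalidUserNameException : ArgumentException
{
    public InvalidUserNameException(string message, string paramName)
        : base(message, paramName) { }
}

static class UserService
{
    public static void Rename(string userName)
    {
        if (string.IsNullOrEmpty(userName) || userName.Contains(" "))
            throw new InvalidUserNameException(
                "User names must be non-empty and contain no spaces.", "userName");
    }

    static void Main()
    {
        try
        {
            Rename("not valid");
        }
        catch (ArgumentException e)   // existing ArgumentException handlers still catch it
        {
            Console.WriteLine(e.GetType().Name); // InvalidUserNameException
            Console.WriteLine(e.ParamName);      // userName
        }
    }
}
```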

A longer discussion on the do's and don'ts of custom exceptions can also be found on this blog, which gives a contradictory view to the above.

Boxing
No boxing discussion is complete without a quick summary of value types and reference types:

Value type
- Stored on the stack, unless it is inside a reference type (e.g. a field on a class), in which case it lives on the heap with that object and is reached through the object's reference.
- Passed around by value rather than by memory location (pointer). This means that when you call a method that takes value types as its parameters, the values are copied.
- Not garbage collected.
- Doesn't support inheritance, only implementations of interfaces (the built-in Enum type does in fact inherit from ValueType, but that's built into the framework).
- The default base Equals() method (inside ValueType) is slow as it uses reflection to get every field's value and compare each of these against the other object.
- Have to be initialised; they can't be null (this is enforced by the compiler, however ILasm is forgiving and generates a 1 byte value if they're null).
- Since .NET 2.0 value types can be made nullable by appending ? to the type, e.g. int? myvalue; This indicates that zero might be a valid value, so null implies no value has been set.
- The ? shortcut translates into:
 
Nullable<int> n = new Nullable<int>();
n = null;
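
A short sketch of how the shorthand and longhand nullable forms behave, using only members that Nullable<T> provides (HasValue, GetValueOrDefault). The variable names are arbitrary.

```csharp
using System;

static class NullableDemo
{
    static void Main()
    {
        int? a = null;                           // shorthand
        Nullable<int> b = new Nullable<int>();   // longhand, also "no value"

        Console.WriteLine(a.HasValue);             // False
        Console.WriteLine(b.HasValue);             // False
        Console.WriteLine(a.GetValueOrDefault());  // 0

        a = 0;                                     // zero is now a real value
        Console.WriteLine(a.HasValue);             // True
        Console.WriteLine(a == b);                 // False
    }
}
```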

Reference types
- Stored on the heap.
- Passed as pointers (references) rather than the actual value.
- Supports inheritance.
- The default base Equals() method (inside Object) compares the two objects' memory locations rather than field values, so this is fast.
- Can be null
- Are garbage collected depending on how new they are and if anything has a reference to the instance.

In short, boxing is taking a value type, and turning it into a reference type. This is done by the CLR by copying every field's value into a reference type. Unboxing is taking a boxed reference type and turning it back into a value type.

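Implied boxing means you aren't explicitly trying to box the value type, but rather it happens automatically, for example when a value type is assigned to an object variable or passed to a method that takes an object parameter.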

Explicit boxing is when you box to a reference type yourself via a manual cast, such as (object)n.

Boxing incurs a performance hit, particularly inside loops. Unboxing is less expensive, as the values are already stored inside the boxed reference type and can simply be copied back out. Most examples of boxing out there show an int being turned into an object and back.
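
The implied and explicit forms, and the usual int-to-object round trip, can be sketched like this. The variable names are arbitrary and the example is illustrative only.

```csharp
using System;

static class BoxingDemo
{
    static void Main()
    {
        int n = 5;

        object implied = n;              // implied boxing: n is copied into a new object
        object explicitBox = (object)n;  // explicit boxing via a cast, a second copy

        n = 10;                          // the boxed copies are unaffected

        int unboxed = (int)implied;      // unboxing: the value is copied back out
        Console.WriteLine(unboxed);                                     // 5
        Console.WriteLine(object.ReferenceEquals(implied, explicitBox)); // False

        // Passing an int where an object is expected boxes it again.
        Console.WriteLine(string.Format("{0}", n));                     // 10
    }
}
```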


You would rarely want to achieve something like the above, particularly since the introduction of generics in 2.0. One of the biggest gains generics added to .NET in 2.0 was the ability to add value types to a collection without the need to box. Consider the following:

Here n is being boxed twice, once each time it's stored inside the ArrayList. The underlying IL is:
IL_000c: box [mscorlib]System.Int32 
IL_0011: callvirt instance int32 [mscorlib]System.Collections.ArrayList::Add(object) 

Changing the above code to a List<T>:

Compiles into:
 
IL_000c: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0) 
IL_0011: nop 

The box instruction is now gone. Generic collections remove the boxing of value types when dealing with lists and collections in general, adding huge performance gains. In fact there is a whole thread here about the difference in speed between a non-generic ArrayList and a List<T>.

There is also information about boxing on MSDN.
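
A side-by-side sketch of the two collections, assuming an int named n stored twice as in the text. The comments mark where a box or unbox would be expected.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class CollectionBoxingDemo
{
    static void Main()
    {
        int n = 5;

        ArrayList untyped = new ArrayList();
        untyped.Add(n);               // boxed: Add takes object
        untyped.Add(n);               // boxed again
        int first = (int)untyped[0];  // cast and unbox on the way out

        List<int> typed = new List<int>();
        typed.Add(n);                 // no box: Add takes int
        typed.Add(n);
        int second = typed[0];        // no cast, no unbox

        Console.WriteLine(first + second); // 10
    }
}
```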

The 'as' keyword instead of explicit casting
Consider the following block of C#:

This is a common branching block that is regularly written. The problem with it is that it unnecessarily performs two casts. When called repeatedly over a set of thousands of objects this can be a performance hit. It's avoidable however, thanks to the 'as' keyword. No exception is thrown when using 'as' if obj is not of the target type (the result is simply null), whilst a bracketed cast would throw an InvalidCastException.
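
The two styles can be sketched as follows. The method names and the -1 fallback are assumptions for illustration; the first method checks the type twice ('is' then the cast), the second only once.

```csharp
using System;

static class CastDemo
{
    // Two type checks: one for 'is', one for the bracketed cast.
    public static int LengthWithIs(object obj)
    {
        if (obj is string)
        {
            string s = (string)obj;
            return s.Length;
        }
        return -1;
    }

    // One type check: 'as' yields null if obj is not a string (or is null).
    public static int LengthWithAs(object obj)
    {
        string s = obj as string;
        if (s != null)
            return s.Length;
        return -1;
    }

    static void Main()
    {
        Console.WriteLine(LengthWithIs("hello")); // 5
        Console.WriteLine(LengthWithAs("hello")); // 5
        Console.WriteLine(LengthWithAs(42));      // -1
        Console.WriteLine(LengthWithAs(null));    // -1
    }
}
```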


String.Equals() vs ==, String.Compare()
Most C# books will tell you from the early chapters that you should always override Equals in your class instead of relying on the base Object.Equals. As mentioned a few times previously, this is essential for value types as the base ValueType.Equals() method (which overrides Object.Equals) uses reflection to decide whether two objects are equal, comparing the field values. The source code for ValueType.Equals (from the Shared Source CLI) is shown below.


For reference types, the base Object.Equals() method does a call to an internal method as its signature shows:

This InternalEquals method can be found in the CLR VM source code here http://www.koders.com/cpp/fid191D8DC42DC6E980F49546893D6E3243A04AB1B1.aspx?s=InternalEquals, or if you prefer to download the source, it's at Shared Source CLI 2.0.

Assuming you know some basic C++, you can see from the InternalEquals source that all sanity checks are done inside the VM rather than in the Object.Equals() source shown above. These include checking that both objects aren't null and checking they're the same type (using a method table lookup). The memory addresses of the two reference types are then compared, returning true or false.

For reference types that don't perform operator overloading, == will check whether the object's memory address is equal to that of the 2nd object. ReferenceEquals() performs the same task. The MSDN docs say: "To check for reference equality, use ReferenceEquals. To check for value equality, use Equals." This is a bit contradictory as the Object.Equals() method checks for reference equality, so reference types that don't override Equals() will be checking for reference equality by default.
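
A quick sketch of that default behaviour, using a hypothetical Plain class that neither overrides Equals() nor overloads ==:

```csharp
using System;

// Hypothetical class: no Equals override, no operator overloads.
class Plain
{
    public int Id;
    public Plain(int id) { Id = id; }
}

static class ReferenceEqualityDemo
{
    static void Main()
    {
        Plain a = new Plain(1);
        Plain b = new Plain(1);   // same field values, different object
        Plain c = a;              // same object

        Console.WriteLine(a == b);                        // False
        Console.WriteLine(a.Equals(b));                   // False
        Console.WriteLine(object.ReferenceEquals(a, b));  // False

        Console.WriteLine(a == c);                        // True
        Console.WriteLine(a.Equals(c));                   // True
        Console.WriteLine(object.ReferenceEquals(a, c));  // True
    }
}
```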

With String.Equals(), the String class has overridden Object.Equals() so that it does its own comparison checking instead of the default reference check:

EqualsHelper is the internal method that determines if two strings are the same. This used to be inside the VM source in 1.1, but has since been moved into the BCL source code. The first check it does is to determine if they are the same length, returning false if they're not. After this it checks each character (in blocks of 10, working backwards on the string) 2 bytes at a time (i.e. a single character in Unicode).

The String class also overrides the == and != operators:

Now say you do the check on two Unicode strings that don't use the default Western Windows-1252 codepage (the default on Windows US/UK installs):

Equals() does a 2 byte Unicode lookup (strings are stored in Unicode format internally in .NET). I've heard or read in the past that "equals is faster than ==". Here's the IL from above:
IL_0000: nop 
IL_0001: ldstr "Unicode text 1" 
IL_0006: stloc.0 
IL_0007: ldstr "Unicode text 2" 
IL_000c: stloc.1 
IL_000d: ldloc.0 
IL_000e: ldloc.1 
IL_000f: call bool [mscorlib]System.String::op_Equality(string,string) 
IL_0014: ldc.i4.0 
IL_0015: ceq 
IL_0017: stloc.2 
IL_0018: ldloc.2 
IL_0019: brtrue.s IL_0026 
IL_001b: ldstr "Equal from ==" 
IL_0020: call void [mscorlib]System.Console::WriteLine(string) 
IL_0025: nop 
IL_0026: ldloc.0 
IL_0027: ldloc.1 
IL_0028: callvirt instance bool [mscorlib]System.String::Equals(string) 
IL_002d: ldc.i4.0 
IL_002e: ceq 
IL_0030: stloc.2 
IL_0031: ldloc.2 
IL_0032: brtrue.s IL_003f 
IL_0034: ldstr "Equals from Equals()" 
IL_0039: call void [mscorlib]System.Console::WriteLine(string) 
IL_003e: nop 
IL_003f: ret

Is it really faster/more optimised to call Equals()? The code above shows that it's one less method on the stack, so arguably yes. The speed difference is unlikely to be noticeable in 99% of applications, and I prefer, and always will use, == unless asked not to. But it's down to the preference of the programmer, and what languages have influenced them in the past. There is an argument that says you should always use Equals as it describes to other programmers exactly what type of comparison you are doing (ordinal, case-sensitive), which I can see, though I still prefer ==.

MSDN describes what == does (or the Equals method it calls):

"This method performs an ordinal (case-sensitive and culture-insensitive) comparison."

When would you ever need a non-ordinal comparison (ordinal meaning a comparison of the numeric value of each character), i.e. String.Compare instead of == or Equals()? Inside a method that uses sorting, for example an IComparer implementation. Sorting è, é, ê and ë is one example. The Equals() method in comparison is more suited to checking your internal strings such as filenames, resource names and database connection strings.
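
To illustrate the difference, here is a small sketch that sorts the same words with an ordinal comparer and with a culture-aware one. The helper method names and sample words are assumptions; StringComparer.InvariantCulture stands in for whichever culture your application would really sort by, and its exact ordering depends on the runtime's globalization data, so no output is claimed for it.

```csharp
using System;

static class StringCompareDemo
{
    // Sorts by the numeric value of each character.
    public static string[] SortOrdinal(string[] words)
    {
        string[] copy = (string[])words.Clone();
        Array.Sort(copy, StringComparer.Ordinal);
        return copy;
    }

    // Sorts using linguistic (culture-aware) rules.
    public static string[] SortLinguistic(string[] words)
    {
        string[] copy = (string[])words.Clone();
        Array.Sort(copy, StringComparer.InvariantCulture);
        return copy;
    }

    static void Main()
    {
        // \u00E9 is the letter é.
        string[] words = { "zebra", "\u00E9clair", "apple" };

        // Ordinal: é (U+00E9) has a higher code point than z (U+007A),
        // so the accented word lands after "zebra".
        Console.WriteLine(string.Join(", ", SortOrdinal(words)));

        // Culture-aware: é is normally ordered alongside e.
        Console.WriteLine(string.Join(", ", SortLinguistic(words)));

        // == and Equals() are ordinal and case-sensitive, which suits
        // internal strings such as resource names.
        Console.WriteLine("File.txt" == "file.txt");   // False
        Console.WriteLine(string.Equals("File.txt", "file.txt",
            StringComparison.OrdinalIgnoreCase));      // True
    }
}
```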


