.NET Daily Quiz Archive - week 3

Daily Quiz #011

This code works correctly when I debug, but fails in release mode. Why? How can I fix it?

using System;
using System.Threading;
internal class Program
{
    public static void Main(string[] args)
    {
        bool complete = false;
        var t = new Thread(() =>
        {
            int count = 0;
            while (true)
            {
                if (complete)
                {
                    break;
                }
                count++; // Do some work.
            }
        });
        t.Start();
        Thread.Sleep(200);
        complete = true;
        if (!t.Join(2000))
        {
            t.Abort();
            Console.WriteLine("Fail - background thread did not complete.");
        }
        else
        {
            Console.WriteLine("Success");
        }
        Console.ReadLine();
    }
}

Answer:

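The optimizer only ever considers the single-threaded scenario. In practice it is the JIT compiler that does the damage here, and it only applies this optimization in release builds, which is why the debug build behaves. In this case complete is loaded into a register once and never re-read, because as far as the compiler is concerned nothing inside the loop that checks it ever modifies it.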

There are two possible solutions. The first is to hoist the variable into a field (static here, because Main is static) and mark it as volatile:

static volatile bool complete = false;
public static void Main(string[] args)
{
  // Etc.
}

The volatile keyword tells the compiler never to optimize away read access to this field (local variables cannot be marked as volatile, hence the field). In other words, every time the value of the field is accessed it is read afresh rather than reused from a register. Note that in this case even hoisting the variable to a non-volatile field fixes the issue. I can’t explain why this is, but you can’t rely on this behaviour. See C# spec 10.4.3.

The other solution to this issue is to use either a lock or a call to Thread.MemoryBarrier, but in this case that is overkill and I wouldn’t advise it. Don’t worry, more on MemoryBarrier in tomorrow’s quiz.
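For reference, here is a minimal sketch of the volatile fix as a complete class. The names VolatileFlagDemo, Run and _complete are mine, not part of the original quiz. I have also left out the Thread.Abort call and made the worker a background thread instead, on the assumption that you may be running on a runtime where Abort is not supported.

```csharp
using System;
using System.Threading;

internal static class VolatileFlagDemo
{
    // Static because Run is static; volatile so that the read inside the
    // loop cannot be hoisted into a register and reused forever.
    private static volatile bool _complete;

    // Returns true if the worker saw the flag change and exited in time.
    public static bool Run()
    {
        _complete = false;
        var worker = new Thread(() =>
        {
            int count = 0;
            while (!_complete)
            {
                count++; // Do some work.
            }
        });
        worker.IsBackground = true; // a stuck worker will not keep the process alive
        worker.Start();
        Thread.Sleep(200);
        _complete = true;
        return worker.Join(2000);
    }
}
```

Calling VolatileFlagDemo.Run() from Main and printing "Success" or "Fail" based on the result gives the same shape as the original program.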


Daily Quiz #012

Given the following class:

class MyClass
{
    int _answer;
    bool _complete;
    void Thread1()
    {
        _answer = 123;
        _complete = true;
    }
    void Thread2()
    {
        if (_complete) Console.WriteLine(_answer);
    }
}

If we execute methods Thread1 and Thread2 on separate concurrent threads, is it ever possible for Thread2 to print 0 instead of 123? If so, how does this happen?

Answer:

This is another case of the optimizer only considering the single-threaded case. The compiler may re-order the operations in a method if it detects that the order is not important to the thread executing it. The CLR, compiler or even your CPU may also introduce caching optimisations that result in variable assignments being invisible to other threads for some time. Here that means Thread2 can see _complete as true before it sees the write to _answer, and so print 0.

These optimisations will never affect single-threaded code, but beware of shared memory between threads for this reason.

The solution to this issue is to apply memory barriers around critical pieces of code. For our example:

class MyClass
{
    int _answer;
    bool _complete;
    void Thread1()
    {
        _answer = 123;
        Thread.MemoryBarrier();
        _complete = true;
        Thread.MemoryBarrier();
    }
    void Thread2()
    {
        Thread.MemoryBarrier();
        if (_complete) Console.WriteLine(_answer);
        Thread.MemoryBarrier();
    }
}

A memory barrier (a full fence) prevents reads and writes from being re-ordered across it, so the code on either side of it takes effect in the order in which it is written. For (much, much) more detail, I recommend this excellent website. If you’re going to be writing a lot of asynchronous/parallel code then it’s an excellent resource.

One more important point: a lot of .NET code will implicitly generate memory barriers. These include lock statements, Interlocked members and anything that relies on signalling (including Task constructs). There is a performance overhead in using memory barriers, but in almost all cases it’s negligible and worth paying.
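As a sketch of the implicit-barrier point, here is the same class written with a lock instead of explicit Thread.MemoryBarrier calls. The class name MyLockedClass, the _gate field and the int? return value are my own choices for illustration (returning the value makes the result easy to check). The original prints to the console instead.

```csharp
using System;

class MyLockedClass
{
    private readonly object _gate = new object();
    private int _answer;
    private bool _complete;

    public void Thread1()
    {
        // Entering and leaving the lock generates the barriers for us,
        // so both writes are published together.
        lock (_gate)
        {
            _answer = 123;
            _complete = true;
        }
    }

    // Returns the answer once Thread1 has run, otherwise null.
    public int? Thread2()
    {
        lock (_gate)
        {
            return _complete ? _answer : (int?)null;
        }
    }
}
```

With this version Thread2 returns either null or 123, never 0, because both methods take the same lock around every access to the shared fields.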


Daily Quiz #013

Today we're moving away from multi-threading and into: Language fundamentals! Consider the following code:

using System;
public class MyFirstType {
       // some stuff
}
public class MySecondType {
       private MyFirstType _inner;
       // implicit conversion to MyFirstType
       public static implicit operator MyFirstType(MySecondType t) {
              return t._inner;
       }
       // some stuff
}
public static class MyFactory {
       public static object GetObject() {
              return new MySecondType();
       }
}
public class Program {
       public static void Main(string[] args) {
              object o = MyFactory.GetObject();
              // first conversion
              try {
                     MyFirstType t1 = (MyFirstType)o;
                     // do stuff
              }
              catch {
                     // deal with conversion error
              }
              // second conversion
              MyFirstType t2 = o as MyFirstType;
              if (t2 != null) {
                     // do stuff
              }
              else {
                     // deal with conversion error
              }
       }
}

Will either of these conversions succeed? If no, why not? If yes, which one(s) and how?

Answer:

Both of these conversions will fail.

First conversion:
Casts will apply implicit user-defined conversions, so you may expect this to succeed. However, user-defined conversions are chosen only at compile time, based on the compile-time types involved. If an object's compile-time type is not known to have a user-defined conversion or inheritance relationship with the target type then the cast will fail. In this case the compile-time type of the object is System.Object and the compiler is not aware of any user-defined conversion, so it emits a plain runtime type check, which throws an InvalidCastException. This does not cause a compiler error, because the compiler will allow any cast that it cannot prove will fail.

Second conversion:
The is and as operators do inspect the runtime type of an object, but they do not apply implicit conversions. These operators only care about inheritance relationships and boxing/unboxing conversions between types. Just because a MySecondType can be a MyFirstType doesn't mean that it is a MyFirstType.
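Here is a small sketch of both failures, plus one way to make the conversion work by restoring the static type first. The types are redeclared so the block stands alone. ConversionDemo and its method names are hypothetical, and I have initialised _inner so that a successful conversion returns something non-null.

```csharp
using System;

public class MyFirstType { }

public class MySecondType
{
    private readonly MyFirstType _inner = new MyFirstType();

    public static implicit operator MyFirstType(MySecondType t)
    {
        return t._inner;
    }
}

public static class ConversionDemo
{
    // Cast from object: the operator is not considered, so this throws.
    public static bool CastFromObjectSucceeds(object o)
    {
        try
        {
            var t = (MyFirstType)o;
            return t != null;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    // 'as' from object: no user-defined conversions either, so this is null.
    public static bool AsFromObjectSucceeds(object o)
    {
        return (o as MyFirstType) != null;
    }

    // Recover the compile-time type first; the compiler can then bind the operator.
    public static bool ConvertViaStaticTypeSucceeds(object o)
    {
        var second = o as MySecondType;
        if (second == null) return false;
        MyFirstType first = second; // implicit operator is invoked here
        return first != null;
    }
}
```

For an object that is really a MySecondType, the first two methods return false and the third returns true.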


Daily Quiz #014

Some people believe that simple class data should be exposed in fields and converted to properties when required. Why is this a terrible, terrible idea? (There are multiple reasons, but you should be able to justify them well). Note: I’m not asking for the reasons that properties are superior to fields — this is obvious. Think of the case of trying to change a field to a property in an existing (potentially large, disparate, distributed) application.

Answer:

The main issue here is binary compatibility. Access to a field generates different IL to property access. This means that if I deploy two assemblies, one of which references a field in the other, and I change that field to a property, then the first assembly will break at runtime (typically with a MissingFieldException) unless it is recompiled and redeployed against the changed one. This isn’t a problem if you always deploy your assemblies together, but it breaks if you are deploying assemblies separately.

Another problem is that fields and properties are inherently different, and they follow different rules in the C# language. For example, a field may be passed as a ref or out argument, while a property may not. Even worse, there are subtle cases where changing a field to a property changes the functional meaning of code without issuing any warnings or errors. Consider this code:

using System;
public struct MyValueType
{
    public int MyThing { get; set; }
    public void ChangeMyThing(int newThing)
    {
        MyThing = newThing;
    }
}
public class MyReferenceType
{
    public MyValueType Value;
}
class Program
{
    static void Main(string[] args)
    {
        MyReferenceType a = new MyReferenceType();
        a.Value.ChangeMyThing(123);
        Console.WriteLine(a.Value.MyThing);
        Console.ReadLine();
    }
}

This works fine: the output is ‘123’. But if we change the line

public MyValueType Value;

to

public MyValueType Value { get; set; }

the output is ‘0’, because the property getter returns a copy of the original value type, and ChangeMyThing modifies that temporary copy rather than the value stored in the object.
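The ref restriction mentioned earlier is easy to sketch too. The types WithField and WithProperty and the RefDemo helper are hypothetical names of mine. The commented-out line is the one the compiler rejects once the field becomes a property.

```csharp
using System;

public class WithField
{
    public int Count;
}

public class WithProperty
{
    public int Count { get; set; }
}

public static class RefDemo
{
    public static void Increment(ref int value)
    {
        value++;
    }

    public static int IncrementField()
    {
        var holder = new WithField();
        Increment(ref holder.Count); // fine: a field is a storage location
        return holder.Count;
    }

    public static int IncrementProperty()
    {
        var holder = new WithProperty();
        // Increment(ref holder.Count); // does not compile (error CS0206)
        int temp = holder.Count;        // every call site needs this workaround
        Increment(ref temp);
        holder.Count = temp;
        return holder.Count;
    }
}
```

Both methods return 1, but only the field version compiles as originally written, so changing the field to a property forces a source change at every ref call site.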

For more information on the topic of why we should (almost) always be using properties for public-facing interfaces, check out Jon Skeet’s article, Why Properties Matter. It’s an easy read and well worth it.


Daily Quiz #015

using System;
using System.Collections.Generic;
public struct MyValueType
{
    private string _name;
    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }
}
public class Program
{
    public static void Main(string[] args)
    {
        var list = new List<MyValueType>();
        var t = new MyValueType();
        t.Name = "Martin";
        list.Add(t);
        var t1 = list[0];
        t1.Name = "Doms";
        Console.WriteLine(list[0].Name);
    }
}

The above program outputs "Martin".

using System;
using System.Collections.Generic;
public interface INameable
{
    string Name { get; set; }
}
public struct MyValueType : INameable
{
    private string _name;
    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }
}
public class Program
{
    public static void Main(string[] args)
    {
        var list = new List<INameable>();
        var t = new MyValueType();
        t.Name = "Martin";
        list.Add(t);
        var t1 = list[0];
        t1.Name = "Doms";
        Console.WriteLine(list[0].Name);
    }
}

This program outputs "Doms".

Why?

How about the more interesting case - try the same using non-generic collections (System.Collections.ArrayList for example). Cast the output as normal. The outcome is the same, but for a different reason. What's happening in this case?

Answer:

First the solution to the initial problem:
In the first program List<MyValueType> allocates an array under the covers of type MyValueType[]. This array is heap-allocated (like all arrays) and each item is a value type. When an item is added, a copy of the added item is placed in the array. When the item is accessed via list[0] no boxing occurs: the list simply copies the first item in the array and returns it. We modify the copy that it returns and throw it away.

The second program uses a List<INameable>, which results in the allocation of an INameable[], and value types added to the list are boxed. Accessing the first item as an INameable returns a reference to the box, and modifications made through that reference are preserved, hence the “Doms” output. Additionally it should be noted that all items exist on the heap in both cases, because the list allocates its array on the heap.

In the problem where we use non-generic collections, things are slightly more interesting. The ArrayList class allocates an object[] on the heap. All value types are boxed in this case, so why does the unboxing not result in the throwing away of the value? The answer is in the boxing object. When an object is boxed, the box reference type implements all interfaces that the original object implements. Because we cast to the interface rather than to the struct, no unboxing takes place: the ArrayList access returns the entire box, a reference type, not a value type. Without the interface we get yet another unboxed copy. Again, all objects are heap-allocated.
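A minimal sketch of the ArrayList case, showing both casts side by side. INameable and MyValueType are redeclared so the block stands alone. ArrayListDemo and its method names are hypothetical.

```csharp
using System;
using System.Collections;

public interface INameable
{
    string Name { get; set; }
}

public struct MyValueType : INameable
{
    private string _name;
    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }
}

public static class ArrayListDemo
{
    private static ArrayList NewList()
    {
        var list = new ArrayList();
        var t = new MyValueType();
        t.Name = "Martin";
        list.Add(t); // boxes a copy of t
        return list;
    }

    public static string ViaInterface()
    {
        var list = NewList();
        var t1 = (INameable)list[0]; // reference conversion: same box, no unboxing
        t1.Name = "Doms";            // mutates the box held by the list
        return ((INameable)list[0]).Name;
    }

    public static string ViaStruct()
    {
        var list = NewList();
        var t1 = (MyValueType)list[0]; // unboxing: copies the value out of the box
        t1.Name = "Doms";              // mutates only the local copy
        return ((MyValueType)list[0]).Name;
    }
}
```

ViaInterface returns "Doms" and ViaStruct returns "Martin", which is exactly the difference between handing back the box and handing back an unboxed copy.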

Head hurt yet? The moral of the story from today and yesterday’s quizzes is for heaven’s sake, do not make your value types mutable. Look at this quiz and yesterday’s quiz — all of our problems result from using mutable value types. Just don’t do it.
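If you want a picture of what the immutable alternative might look like, here is one possible sketch. It assumes a compiler that supports readonly struct (C# 7.2 or later), and the names ImmutableName, WithName and ImmutableDemo are mine, not from the quiz.

```csharp
using System;
using System.Collections.Generic;

public readonly struct ImmutableName
{
    public ImmutableName(string name)
    {
        Name = name;
    }

    public string Name { get; }

    // "Changing" the name produces a new value instead of mutating this one.
    public ImmutableName WithName(string name)
    {
        return new ImmutableName(name);
    }
}

public static class ImmutableDemo
{
    public static string Run()
    {
        var list = new List<ImmutableName> { new ImmutableName("Martin") };
        // list[0].Name = "Doms"; // does not compile: Name has no setter
        list[0] = list[0].WithName("Doms"); // the replacement is explicit
        return list[0].Name;
    }
}
```

Run returns "Doms", and there is no way to write the silent modify-a-copy bug from the quizzes above: the only way to change the stored value is to assign a new one back into the list.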