The following is an archive of the first week of daily quizzes that I have been sending to my colleagues. They are intended as exercises for .NET developers, testing fundamentals and gotchas in the framework and the C# programming language. I will be posting weekly quiz archives going forward.
Daily Quiz #001
Can you spot the potential deadlock in this code?
public class FileMonitor {
    private object _lock = new object();
    public event EventHandler FileCreated;
    protected void GenerateNewFile(string filename) {
        lock (_lock) {
            File.Create(filename);
            _currentFile = new FileInfo(filename);
            FileCreated(this, EventArgs.Empty);
        }
    }
    private FileInfo _currentFile;
    public FileInfo CurrentFile {
        get {
            lock (_lock)
                return _currentFile;
        }
    }
}
How would you work around this? What principles or lessons can we learn from this?
Answer
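Say the client code attaches an event handler like this one (the handler name is illustrative):
private void OnFileCreated(object sender, EventArgs e) {
    Console.WriteLine(_fileMonitor.CurrentFile.FullName);
}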
You may execute this code and it seems fine. In a single-threaded application this code will execute without deadlocking.
Now try running the FileMonitor in a background thread and marshalling the event handler code to your UI thread in a Windows Forms / WPF application.
When the UI thread accesses FileMonitor.CurrentFile it will try to acquire the synchronization lock around _currentFile. The event source has this lock and will not return until the event handler returns, which can’t happen while the lock cannot be acquired – you are deadlocked.
Why doesn’t this code deadlock in a single-threaded application? According to the C# specification (8.12):
“While a mutual-exclusion lock is held, code executing in the same execution thread can also obtain and release the lock. In contrast, code executing in other threads is blocked from obtaining the lock until the lock is released.”
What is the lesson here? Never call unknown code from inside a locked statement block. Unknown code, by definition, could call anything within your data structures. It’s foolish to try to design around this – avoid the problem by never calling unknown code while locked.
Unknown code includes events, any delegates (Action
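One common way to follow this rule is to copy the event delegate while holding the lock and invoke it only after the lock is released. The following is a minimal sketch of that idea, not a definitive rewrite of the quiz class. It assumes GenerateNewFile can be made public for the sake of the example, and it also disposes the stream returned by File.Create, which the quiz code leaves open.

```csharp
using System;
using System.IO;

public class FileMonitor {
    private readonly object _lock = new object();
    private FileInfo _currentFile;

    public event EventHandler FileCreated;

    public FileInfo CurrentFile {
        get { lock (_lock) { return _currentFile; } }
    }

    // Assumption: public here so the sketch can be exercised directly.
    public void GenerateNewFile(string filename) {
        EventHandler handler;
        lock (_lock) {
            // File.Create returns an open FileStream; dispose it to release the handle.
            using (File.Create(filename)) { }
            _currentFile = new FileInfo(filename);
            // Snapshot the subscriber list while the state is still protected.
            handler = FileCreated;
        }
        // Unknown code runs only after the lock has been released,
        // so a handler on another thread can read CurrentFile freely.
        if (handler != null) handler(this, EventArgs.Empty);
    }
}
```

The trade-off is that by the time a handler runs, another thread may already have changed CurrentFile. If handlers need the exact file that triggered the event, passing it in the event arguments is a safer design than having them read shared state.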
Daily Quiz #002
You’re reading a CSV file using the following methods, which return a sequence of sequences of numbers:
public static IEnumerable<string> ReadLines(TextReader reader) {
    var txt = reader.ReadLine();
    while (txt != null) {
        yield return txt;
        txt = reader.ReadLine();
    }
}
public static int Parse(this string input) {
    int val;
    return int.TryParse(input, out val) ? val : default(int);
}
public static IEnumerable<IEnumerable<int>> ReadCsv(TextReader reader) {
    var lines = from line in ReadLines(reader) select line.Split(',');
    var result = from line in lines
                 select from num in line
                        select num.Parse();
    return result;
}
You call the code like so:
IEnumerable<IEnumerable<int>> numberLines;
// close the file when we're done reading
using (var reader = new StreamReader("file.csv")) {
    numberLines = ReadCsv(reader);
}
// do stuff with the results
foreach (var line in numberLines) {
    foreach (var num in line) {
        Console.WriteLine(num);
    }
}
The application throws an exception. Without compiling the code, where, and why? Assume the file exists and is readable and well-formatted.
Answer
We tried to read from the file after closing it. LINQ query expressions (and iterator methods such as ReadLines) are not executed eagerly; they are evaluated on demand, as the results are enumerated. ReadCsv only builds the query, so nothing is read inside the using block. The first read happens in the outer foreach, after the StreamReader has been disposed, and ReadLine throws an ObjectDisposedException.
This was relatively easy to spot in this example, but what if we had returned the IEnumerable
There are several possible solutions to this problem. Consider encapsulating the TextReader resource in a single method – which method would you choose?
Beware of multiple enumerations – if you’re going to enumerate over the resource more than once then be very careful about where you will dispose of it.
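One possible answer, sketched below under a few assumptions, is to let a single iterator method own the reader: it opens the file when enumeration starts and disposes it when enumeration ends. The class name CsvReader, the path parameter and the ParseOrDefault helper are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvReader {
    // The reader lives entirely inside the iterator. It is created on the
    // first MoveNext and disposed when the enumeration finishes or the
    // enumerator is disposed (which foreach does automatically).
    public static IEnumerable<int[]> ReadCsv(string path) {
        using (var reader = new StreamReader(path)) {
            string line;
            while ((line = reader.ReadLine()) != null) {
                // Materialize each row so it no longer depends on the reader.
                yield return line.Split(',').Select(s => ParseOrDefault(s)).ToArray();
            }
        }
    }

    // Same behaviour as the quiz's Parse: unparsable cells become 0.
    private static int ParseOrDefault(string input) {
        int val;
        return int.TryParse(input, out val) ? val : default(int);
    }
}
```

With this shape, every enumeration reopens the file, so enumerating twice is safe but reads the file twice. If the data is needed more than once, calling ToList() on the result caches it in memory.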
For more see Bill Wagner’s More Effective C#, Item 42: Avoid Capturing Expensive Resources. More coming from this chapter soon.
Daily Quiz #003
public struct A {
    public int _field1;
}
public class C {
    public A a = new A();
    public A b = new A();
}
var c = new C();
1. How many allocations did I just make? For how much memory?
2. Now I allocate an array of 100 A values. How many allocations? How much memory?
public class B {
    public int _field1;
}
public class D {
    public B a = new B();
    public B b = new B();
}
var d = new D();
3. How many allocations? How much memory?
4. This time, for an array of 100 elements of a class type?
Note: by “allocations” I mean allocations on the heap. Stack space is comparatively cheap.
Answer
See comments for a slight correction on this answer.
1) One allocation for 8 bytes. Structs are value types and are allocated in-line.
2) One allocation of 400 bytes (100 * sizeof(A)).
3) Three allocations, one for 12 bytes (assuming 32-bit pointers, 4 bytes per field plus 4 bytes for superclass pointer) and two for 8 bytes each (is someone able to confirm this? I couldn’t find precise documentation in the spec. Email me!)
4) One allocation for 100*sizeof(c). However, each member of the array is null-initialized. Populating the array will take a further 100 allocations for a total of 101 allocations. In number 2 all members are allocated inline and default-initialized. Modifying these already-initialized members is much more efficient than allocating new heap space!
The important part here is the number of allocations, not the memory usage. Memory isn’t the only consideration when deciding whether to use value types or reference types. There is a semantic difference and there is a major performance difference if you’re allocating a lot of memory. Heap allocation is expensive. Inline allocation of value types is much cheaper.
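The difference between answers 2 and 4 can be seen directly in code. This is a small sketch that reuses the quiz's A and B types; the AllocationDemo class and its method names are hypothetical.

```csharp
using System;

public struct A { public int _field1; }
public class B { public int _field1; }

public static class AllocationDemo {
    public static A[] MakeStructArray() {
        // One heap allocation: the array itself. All 100 A values are
        // stored inline and zero-initialized.
        var items = new A[100];
        // Writing to an element modifies it in place; nothing new is allocated.
        items[0]._field1 = 42;
        return items;
    }

    public static B[] MakeClassArray() {
        // One heap allocation for an array of 100 null references.
        var items = new B[100];
        // Populating it costs one further allocation per element.
        for (int i = 0; i < items.Length; i++) {
            items[i] = new B();
        }
        items[0]._field1 = 42;
        return items;
    }
}
```

Accessing a field on an element of a freshly created B[] before populating it would throw a NullReferenceException, while every element of a fresh A[] is immediately usable.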
Daily Quiz #004
What are the three main principles that GetHashCode() must ALWAYS follow? Why are these important?
Answer
1. GetHashCode must be instance invariant. Method calls on the object should not change the hash value.
2. Objects that are equal (as defined by operator==) must return the same hash code.
3. The hash function should generate a random distribution across all integers.
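As an illustration, here is a minimal sketch of a type that is intended to satisfy all three rules. Point is a hypothetical example class, and the multiplier 397 is just a commonly used prime for combining fields, not a requirement.

```csharp
using System;
using System.Collections.Generic;

public sealed class Point : IEquatable<Point> {
    // Readonly fields: the state used for equality never changes,
    // so the hash code is stable for the lifetime of the instance (rule 1).
    private readonly int _x;
    private readonly int _y;

    public Point(int x, int y) { _x = x; _y = y; }

    public int X { get { return _x; } }
    public int Y { get { return _y; } }

    public bool Equals(Point other) {
        return !ReferenceEquals(other, null) && _x == other._x && _y == other._y;
    }

    public override bool Equals(object obj) {
        return Equals(obj as Point);
    }

    // Uses exactly the fields that Equals compares, so equal points
    // always produce equal hash codes (rule 2). Mixing both fields
    // aims for a reasonable spread of values (rule 3).
    public override int GetHashCode() {
        unchecked { return (_x * 397) ^ _y; }
    }
}
```

Because two equal points hash identically, a HashSet<Point> containing new Point(1, 2) will report that it contains a second, separately constructed new Point(1, 2).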
Daily Quiz #005
I write the following code:
public abstract class Base {
    public virtual event EventHandler MyEvent;
    public virtual void Foo() {
        if (MyEvent != null) {
            MyEvent(this, EventArgs.Empty);
        }
    }
}
public class Derived : Base {
    public override event EventHandler MyEvent;
    public override void Foo() {
        Console.WriteLine("overriden");
        base.Foo();
    }
}
public class M {
    public static void Main(string[] args) {
        Base b = new Derived();
        b.MyEvent += (o,e) => Console.WriteLine("Event raised");
        b.Foo();
    }
}
Output:
overriden
Why does it appear that the event is never raised? Hint: this one is quite subtle and involves code generated by the compiler.
Answer:
When we declare the virtual event in Base:
public virtual event EventHandler MyEvent;
the compiler generates (roughly) the following code:
private EventHandler myEvent;
public virtual event EventHandler MyEvent {
    [MethodImpl(MethodImplOptions.Synchronized)]
    add { myEvent += value; }
    [MethodImpl(MethodImplOptions.Synchronized)]
    remove { myEvent -= value; }
}
Note the private backing field for the event property. When Derived declares the event override, (almost) the same code is generated in Derived. The private field (Base.myEvent) is now hidden.
Declaring the derived event means that the hidden backing field in Base is no longer assigned when clients attach to the virtual event, and there is no code in Derived to raise the new backing event field.
One possible fix is to override the event using property syntax:
public class Derived : Base {
    public override event EventHandler MyEvent {
        add { base.MyEvent += value; }
        remove { base.MyEvent -= value; }
    }
    // etc.
}
The problem now is that only Base can raise the event. Derived has no access to the private backing field of the event and cannot raise it (just like client code cannot raise an event).
Another possible solution is to raise the event in a virtual method in Base:
public abstract class Base {
    public virtual event EventHandler MyEvent;
    public virtual void RaiseEvent() {
        if (MyEvent != null) MyEvent(this, EventArgs.Empty);
    }
}
But at this stage what have you gained by making the event virtual? You can achieve everything you needed to in your virtual event override in the virtual method override.
Bottom line: avoid virtual events. It’s not worth the hassle and there’s almost always a better way.
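To make that alternative concrete, here is a sketch in which the event itself is not virtual and derived classes customize behaviour by overriding a protected raise method. The method name OnMyEvent follows the usual .NET naming convention but is an assumption, not something from the quiz code.

```csharp
using System;

public abstract class Base {
    // Non-virtual event: there is only ever one backing field.
    public event EventHandler MyEvent;

    public void Foo() {
        OnMyEvent(EventArgs.Empty);
    }

    // Derived classes override this method instead of the event.
    protected virtual void OnMyEvent(EventArgs e) {
        EventHandler handler = MyEvent;
        if (handler != null) handler(this, e);
    }
}

public class Derived : Base {
    protected override void OnMyEvent(EventArgs e) {
        Console.WriteLine("overridden");
        // The base implementation raises the event for all subscribers.
        base.OnMyEvent(e);
    }
}
```

With this version, subscribing through a Base reference to a Derived instance and then calling Foo() runs the derived code first and then raises the event, because subscribers and the raising code share the same backing field.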
Bonus: initialize your events with an empty delegate (= delegate { }) instead of leaving them null, so that you never have to check for a null event before raising them.
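A minimal sketch of the empty-delegate idiom follows. Publisher, Changed and RaiseChanged are hypothetical names used only for this example.

```csharp
using System;

public class Publisher {
    // The anonymous no-op subscriber means the invocation list is never
    // empty, so the event field is never null.
    public event EventHandler Changed = delegate { };

    public void RaiseChanged() {
        // No null check needed, even when nobody has subscribed.
        Changed(this, EventArgs.Empty);
    }
}
```

The cost is one extra, trivial delegate call on every raise, which is usually negligible compared with the clarity gained.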
By Spa September 11, 2012 - 5:17 am
Daily Quiz: what two mistakes did you make in the answer to quiz #3?
By Martin Doms September 11, 2012 - 1:11 pm
I’m guessing it’s in the memory size usage. The main point of that question was the allocation counts, which I’m fairly sure are correct. I wouldn’t be surprised if I’m incorrect about the overhead for a class in memory. Also I believe I forgot to take into account the overhead of the Array objects (Length field, etc).
Thanks for your feedback.
By Spa September 11, 2012 - 1:45 pm
Yes, the allocation counts are right.
(2) sizeof(C)
(4) sizeof(D)
By Martin Doms September 11, 2012 - 2:01 pm
Thanks!
By Jamie September 18, 2012 - 10:41 pm
Unless I’m missing something, in question 2 you are doing:
var lines = from line in ReadLines(t) select line.Split(',');
Shouldn’t this be
var lines = from line in ReadLines(reader) select line.Split(',');
By Martin Doms September 19, 2012 - 9:55 am
Nope you’re absolutely correct Jamie, thanks for pointing that out.
By x September 20, 2012 - 2:59 am
The first answer to quiz 4 is wrong; GetHashCode() is only required to return the same hashcode consistently as long as no modification has been made to an object’s state used for Equals(). If a function call changes the object’s state in such a way for Equals() to be affected then GetHashCode() can change to reflect this (although should be consistent for subsequent requests without state change).
“The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object’s Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again.”
By Martin Doms September 20, 2012 - 7:30 am
Where is that quote from, x? I disagree with the point. Let’s say I insert some object into a hashset, then call some method, say, ‘Recalculate()’ on the object that changes the object in such a way that both Equals and GetHashCode are changed but consistent with each other.
Now if I attempt to retrieve my object from the hashset, it won’t be found because it’s stored in a bucket corresponding to a different hashcode. I could also reinsert my object and it would exist twice in my hashset under different buckets. Essentially I have broken the semantics of a hashset and orphaned one object instance because my GetHashCode method is not instance invariant.
GetHashCode should be applied to class invariants. Preferably only immutable (value?) types will be used in hash-based data structures (because frankly lots of mutable classes are nearly meaningless if we only consider their invariants).
Thanks for your input!
By x September 20, 2012 - 9:15 pm
@Martin Doms: MSDN – http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx
By x September 20, 2012 - 9:38 pm
@Martin Doms,
Please note that the reverse issue also exists. By not keeping GetHashCode() and Equals() in sync you are allowing duplicate items (as defined by their Equals method) within your HashSet if mutations occur which changes their Equals() result but not their HashCode. E.g. you could have two identical items within the same bucket!
Ideally you should not store mutable items within a HashSet but instead use immutable items which return a new instance on any attempt to mutate them. This allows you to keep GetHashCode() and Equals() in sync whilst following all the tenets of GetHashCode() and Equals(), and keeps the HashSet functioning as expected.
By x September 20, 2012 - 9:44 pm
Please ignore the first paragraph above; it’s wrong and contains a fallacy. The second paragraph is however still relevant.