(Post 30/11/2007) Earlier, we said that the
need for synchronization arises even in the simple case of assigning or
incrementing a field. Although locking can always satisfy this need, a
contended lock means that a thread must block, suffering the overhead and
latency of being temporarily descheduled. The .NET Framework's non-blocking
synchronization constructs can perform simple operations without ever
blocking, pausing, or waiting. These involve using instructions that are
strictly atomic, and instructing the compiler to use "volatile" read and
write semantics. At times these constructs can also be simpler to use than locks.
Atomicity and Interlocked
A statement is atomic if it executes as a single indivisible
instruction. Strict atomicity precludes any possibility of preemption.
In C#, a simple read or assignment on a field of 32 bits or less is atomic
(assuming a 32-bit CPU). Operations on larger fields are non-atomic, as
are statements that combine more than one read/write operation:
class Atomicity {
  static int x, y;
  static long z;

  static void Test() {
    long myLocal;
    x = 3;       // Atomic
    z = 3;       // Non-atomic (z is 64 bits)
    myLocal = z; // Non-atomic (z is 64 bits)
    y += x;      // Non-atomic (read AND write operation)
    x++;         // Non-atomic (read AND write operation)
  }
}
Reading and writing 64-bit fields is non-atomic on 32-bit
CPUs in the sense that two separate 32-bit memory locations are involved.
If thread A reads a 64-bit value while thread B is updating it, thread
A may end up with a bitwise combination of the old and new values.
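One way to avoid such torn reads and writes on a 32-bit CPU is to guard every access to the 64-bit field with the same lock. A minimal sketch (the class and locker names here are illustrative, not from the original):

```csharp
class Safe64 {
  static long z;
  static readonly object locker = new object();

  // Guarding both the write and the read with the same lock
  // ensures no thread ever observes a half-written value.
  public static void Write (long value) {
    lock (locker) z = value;
  }

  public static long Read() {
    lock (locker) return z;
  }
}
```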
Unary operators of the kind x++ require first reading
a variable, then processing it, then writing it back. Consider the following
class:
class ThreadUnsafe {
  static int x = 1000;
  static void Go() { for (int i = 0; i < 100; i++) x--; }
}
You might expect that if 10 threads concurrently ran
Go, then x would end up 0. However, this is not guaranteed, because it’s
possible for one thread to preempt another in between retrieving x’s current
value, decrementing it, and writing it back (resulting in an out-of-date
value being written).
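Serializing the decrement makes the result deterministic again. A minimal sketch (the class and locker names are illustrative) that wraps the decrement in a lock and joins ten threads, after which x reliably reaches 0:

```csharp
using System.Threading;

class ThreadSafeDecrement {
  static int x = 1000;
  static readonly object locker = new object();

  static void Go() {
    for (int i = 0; i < 100; i++)
      lock (locker) x--;          // The read-modify-write now runs as a unit
  }

  public static int Run() {
    var threads = new Thread [10];
    for (int i = 0; i < 10; i++) (threads [i] = new Thread (Go)).Start();
    foreach (Thread t in threads) t.Join();
    return x;                     // 10 threads x 100 decrements: always 0
  }
}
```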
One way to solve these problems is to wrap the non-atomic
operations in a lock statement. Locking, in fact, simulates atomicity.
The Interlocked class, however, provides a simpler and faster solution
for simple atomic operations:
class Program {
  static long sum;

  static void Main() {                                       // sum
    // Simple increment/decrement operations:
    Interlocked.Increment (ref sum);                         // 1
    Interlocked.Decrement (ref sum);                         // 0

    // Add/subtract a value:
    Interlocked.Add (ref sum, 3);                            // 3

    // Read a 64-bit field:
    Console.WriteLine (Interlocked.Read (ref sum));          // 3

    // Write a 64-bit field while reading its previous value:
    // (This prints "3" while updating sum to 10)
    Console.WriteLine (Interlocked.Exchange (ref sum, 10));  // 10

    // Update a field only if it matches a certain value (10):
    Interlocked.CompareExchange (ref sum, 123, 10);          // 123
  }
}
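CompareExchange also enables lock-free read-modify-write of arbitrary computations: read the field, compute the new value, then commit it only if the field hasn't changed in the meantime, retrying otherwise. A minimal sketch of this pattern (the class and method names are illustrative):

```csharp
using System.Threading;

class LockFreeUpdate {
  static long sum = 1;

  // Atomically multiply sum by a factor, with no lock.
  public static long MultiplyBy (long factor) {
    while (true) {
      long snapshot = Interlocked.Read (ref sum);
      long newValue = snapshot * factor;
      // Commit only if sum is still what we read; otherwise
      // another thread got in first, so loop and try again.
      if (Interlocked.CompareExchange (ref sum, newValue, snapshot) == snapshot)
        return newValue;
    }
  }
}
```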
Using Interlocked is generally more efficient than obtaining
a lock, because it can never block and suffer the overhead of its thread
being temporarily descheduled.
Interlocked is also valid across multiple processes –
in contrast to the lock statement, which is effective only across threads
in the current process. An example of where this might be useful is in
reading and writing shared memory.
Memory Barriers and Volatility
Consider this class:
class Unsafe {
  static bool endIsNigh, repented;

  static void Main() {
    new Thread (Wait).Start();   // Start up the spinning waiter
    Thread.Sleep (1000);         // Give it a second to warm up!
    repented = true;
    endIsNigh = true;
    Console.WriteLine ("Going...");
  }

  static void Wait() {
    while (!endIsNigh);          // Spin until endIsNigh
    Console.WriteLine ("Gone, " + repented);
  }
}
Here's a question: can a significant delay separate "Going..."
from "Gone" – in other words, is it possible for the Wait method
to continue spinning in its while loop after the endIsNigh flag has been
set to true? Furthermore, is it possible for the Wait method to write
"Gone, false"?
The answer to both questions is, theoretically, yes,
on a multi-processor machine, if the thread scheduler assigns the two
threads different CPUs. The repented and endIsNigh fields can be cached
in CPU registers to improve performance, with a potential delay before
their updated values are written back to memory. And when the CPU registers
are written back to memory, it’s not necessarily in the order they were
originally updated.
This caching can be circumvented by using the static
methods Thread.VolatileRead and Thread.VolatileWrite to read and write
to the fields. VolatileRead means “read the latest value”; VolatileWrite
means “write immediately to memory”. The same functionality can be achieved
more elegantly by declaring the field with the volatile modifier:
class ThreadSafe {
  // Always use volatile read/write semantics:
  volatile static bool endIsNigh, repented;
  ...
If the volatile keyword is used in preference to the
VolatileRead and VolatileWrite methods, one can think in the simplest
terms: "do not thread-cache this field!"
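For comparison, the same example written against the static methods might look like the following sketch. Note that Thread.VolatileRead and Thread.VolatileWrite have no bool overloads, so int flags stand in for the two bools (an adaptation, not part of the original example):

```csharp
using System;
using System.Threading;

class ThreadSafeExplicit {
  // 0 = false, 1 = true (VolatileRead/VolatileWrite lack bool overloads)
  public static int endIsNigh, repented;

  public static void Run() {
    new Thread (Wait).Start();               // Start up the spinning waiter
    Thread.Sleep (1000);                     // Give it a second to warm up!
    Thread.VolatileWrite (ref repented, 1);  // Write straight to memory
    Thread.VolatileWrite (ref endIsNigh, 1);
    Console.WriteLine ("Going...");
  }

  static void Wait() {
    while (Thread.VolatileRead (ref endIsNigh) == 0);  // Re-read every spin
    Console.WriteLine ("Gone, " + (Thread.VolatileRead (ref repented) == 1));
  }
}
```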
The same effect can be achieved by wrapping access to
repented and endIsNigh in lock statements. This works because an (intended)
side effect of locking is to create a memory barrier – a guarantee that
the volatility of fields used within the lock statement will not extend
outside the lock statement’s scope. In other words, the fields will be
fresh on entering the lock (volatile read) and be written to memory before
exiting the lock (volatile write).
Using a lock statement would in fact be necessary if
we needed to access the fields repented and endIsNigh atomically, for
instance, to run something like this:

lock (locker) { if (endIsNigh) repented = true; }
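Fleshed out, with an (assumed) dedicated locker object and the fields made accessible for demonstration, this test-and-set might look like:

```csharp
class AtomicCheckAndSet {
  public static bool endIsNigh, repented;
  static readonly object locker = new object();

  public static void RepentIfEndIsNigh() {
    // The test and the assignment execute inside one lock,
    // so no other thread can change either field in between.
    lock (locker) {
      if (endIsNigh)
        repented = true;
    }
  }
}
```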
A lock may also be preferable where a field is used many
times in a loop (assuming the lock is held for the duration of the loop).
While a single volatile read/write beats a single lock in performance, it's
unlikely that a thousand volatile read/write operations would beat one lock!
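For instance, rather than paying for a volatile access on every iteration, one might take the lock once around the whole loop. A sketch (the class, field, and method names are illustrative):

```csharp
class LoopUnderLock {
  static int total;
  static readonly object locker = new object();

  public static int AddRange (int count) {
    // One lock acquisition covers the whole loop - far cheaper
    // than a volatile (or locked) access on every iteration.
    lock (locker) {
      for (int i = 1; i <= count; i++)
        total += i;
      return total;
    }
  }
}
```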
Volatility is relevant only to primitive integral (and
unsafe pointer) types – other types are not cached in CPU registers and
cannot be declared with the volatile keyword. Volatile read and write
semantics are applied automatically when fields are accessed via the Interlocked
class.
If one has a policy of always accessing fields shared
by multiple threads within a lock statement, then volatile and
Interlocked are unnecessary.
(Collected)