(Post 30/11/2007) Earlier, we said that the
need for synchronization arises even in the simple case of assigning or
incrementing a field. Although locking can always satisfy this need, a
contended lock means that a thread must block, suffering the overhead and
latency of being temporarily descheduled. The .NET Framework's non-blocking
synchronization constructs can perform simple operations without ever
blocking, pausing, or waiting. These involve using instructions that are
strictly atomic, and instructing the compiler to use "volatile" read and
write semantics. At times these constructs can also be simpler to use than locks.
Atomicity and Interlocked
A statement is atomic if it executes as a single indivisible
instruction. Strict atomicity precludes any possibility of preemption.
In C#, a simple read or assignment on a field of 32 bits or less is atomic
(assuming a 32-bit CPU). Operations on larger fields are non-atomic, as
are statements that combine more than one read/write operation:
class Atomicity {
  static int x, y;
  static long z;

  static void Test() {
    long myLocal;
    x = 3;       // Atomic
    z = 3;       // Non-atomic (z is 64 bits)
    myLocal = z; // Non-atomic (z is 64 bits)
    y += x;      // Non-atomic (read AND write operation)
    x++;         // Non-atomic (read AND write operation)
  }
}
Reading and writing 64-bit fields is non-atomic on 32-bit
CPUs in the sense that two separate 32-bit memory locations are involved.
If thread A reads a 64-bit value while thread B is updating it, thread
A may end up with a bitwise combination of the old and new values.
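One way to avoid such torn reads and writes on a 32-bit CPU is to guard every access to the 64-bit field with the same lock. A minimal sketch (the class and locker names here are illustrative, not from the original):

```csharp
class Safe64 {
  static long z;
  static readonly object locker = new object();

  // Guarding both the write and the read with the same lock
  // ensures no thread ever observes a half-written value.
  public static void Write (long value) {
    lock (locker) z = value;
  }

  public static long Read() {
    lock (locker) return z;
  }
}
```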
Unary operators of the kind x++ require first reading
a variable, then processing it, then writing it back. Consider the following
class:
class ThreadUnsafe {
  static int x = 1000;
  static void Go() { for (int i = 0; i < 100; i++) x--; }
}
You might expect that if 10 threads concurrently ran
Go, then x would end up 0. However, this is not guaranteed, because it’s
possible for one thread to preempt another in between retrieving x’s current
value, decrementing it, and writing it back (resulting in an out-of-date
value being written).
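Serializing the decrement makes the result deterministic again. A minimal sketch (the class and locker names are illustrative) that wraps the decrement in a lock and joins ten threads, after which x reliably reaches 0:

```csharp
using System.Threading;

class ThreadSafeDecrement {
  static int x = 1000;
  static readonly object locker = new object();

  static void Go() {
    for (int i = 0; i < 100; i++)
      lock (locker) x--;          // The read-modify-write now runs as a unit
  }

  public static int Run() {
    var threads = new Thread [10];
    for (int i = 0; i < 10; i++) (threads [i] = new Thread (Go)).Start();
    foreach (Thread t in threads) t.Join();
    return x;                     // 10 threads x 100 decrements: always 0
  }
}
```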
One way to solve these problems is to wrap the non-atomic
operations in a lock statement. Locking, in fact, simulates atomicity.
The Interlocked class, however, provides a simpler and faster solution
for simple atomic operations:
class Program {
  static long sum;

  static void Main() {                                       // sum
    // Simple increment/decrement operations:
    Interlocked.Increment (ref sum);                         // 1
    Interlocked.Decrement (ref sum);                         // 0

    // Add/subtract a value:
    Interlocked.Add (ref sum, 3);                            // 3

    // Read a 64-bit field:
    Console.WriteLine (Interlocked.Read (ref sum));          // 3

    // Write a 64-bit field while reading its previous value:
    // (This prints "3" while updating sum to 10)
    Console.WriteLine (Interlocked.Exchange (ref sum, 10));  // 10

    // Update a field only if it matches a certain value (10):
    Interlocked.CompareExchange (ref sum, 123, 10);          // 123
  }
}
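CompareExchange also enables lock-free read-modify-write of arbitrary computations: read the field, compute the new value, then commit it only if the field hasn't changed in the meantime, retrying otherwise. A minimal sketch of this pattern (the class and method names are illustrative):

```csharp
using System.Threading;

class LockFreeUpdate {
  static long sum = 1;

  // Atomically multiply sum by a factor, with no lock.
  public static long MultiplyBy (long factor) {
    while (true) {
      long snapshot = Interlocked.Read (ref sum);
      long newValue = snapshot * factor;
      // Commit only if sum is still what we read; otherwise
      // another thread got in first, so loop and try again.
      if (Interlocked.CompareExchange (ref sum, newValue, snapshot) == snapshot)
        return newValue;
    }
  }
}
```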
Using Interlocked is generally more efficient than obtaining
a lock, because it can never block and suffer the overhead of its thread
being temporarily descheduled.
Interlocked is also valid across multiple processes –
in contrast to the lock statement, which is effective only across threads
in the current process. An example of where this might be useful is in
reading and writing shared memory.
Memory Barriers and Volatility
Consider this class:
class Unsafe {
  static bool endIsNigh, repented;

  static void Main() {
    new Thread (Wait).Start();   // Start up the spinning waiter
    Thread.Sleep (1000);         // Give it a second to warm up!
    repented = true;
    endIsNigh = true;
    Console.WriteLine ("Going...");
  }

  static void Wait() {
    while (!endIsNigh);          // Spin until endIsNigh
    Console.WriteLine ("Gone, " + repented);
  }
}
Here's a question: can a significant delay separate "Going..."
from "Gone" – in other words, is it possible for the Wait method
to continue spinning in its while loop after the endIsNigh flag has been
set to true? Furthermore, is it possible for the Wait method to write
"Gone, false"?
The answer to both questions is, theoretically, yes,
on a multi-processor machine, if the thread scheduler assigns the two
threads different CPUs. The repented and endIsNigh fields can be cached
in CPU registers to improve performance, with a potential delay before
their updated values are written back to memory. And when the CPU registers
are written back to memory, it’s not necessarily in the order they were
originally updated.
This caching can be circumvented by using the static
methods Thread.VolatileRead and Thread.VolatileWrite to read and write
to the fields. VolatileRead means “read the latest value”; VolatileWrite
means “write immediately to memory”. The same functionality can be achieved
more elegantly by declaring the field with the volatile modifier:
class ThreadSafe {
  // Always use volatile read/write semantics:
  volatile static bool endIsNigh, repented;
  ...
If the volatile keyword is used in preference to the
VolatileRead and VolatileWrite methods, one can think in the simplest
terms: "do not thread-cache this field!"
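For comparison, the same example written against the static methods might look like the following sketch. Note that Thread.VolatileRead and Thread.VolatileWrite have no bool overloads, so int flags stand in for the two bools (an adaptation, not part of the original example):

```csharp
using System;
using System.Threading;

class ThreadSafeExplicit {
  // 0 = false, 1 = true (VolatileRead/VolatileWrite lack bool overloads)
  public static int endIsNigh, repented;

  public static void Run() {
    new Thread (Wait).Start();               // Start up the spinning waiter
    Thread.Sleep (1000);                     // Give it a second to warm up!
    Thread.VolatileWrite (ref repented, 1);  // Write straight to memory
    Thread.VolatileWrite (ref endIsNigh, 1);
    Console.WriteLine ("Going...");
  }

  static void Wait() {
    while (Thread.VolatileRead (ref endIsNigh) == 0);  // Re-read every spin
    Console.WriteLine ("Gone, " + (Thread.VolatileRead (ref repented) == 1));
  }
}
```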
The same effect can be achieved by wrapping access to
repented and endIsNigh in lock statements. This works because an (intended)
side effect of locking is to create a memory barrier – a guarantee that
the volatility of fields used within the lock statement will not extend
outside the lock statement’s scope. In other words, the fields will be
fresh on entering the lock (volatile read) and be written to memory before
exiting the lock (volatile write).
Using a lock statement would in fact be necessary if
we needed to access the fields repented and endIsNigh atomically, for
instance, to run something like this:

lock (locker) { if (endIsNigh) repented = true; }
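Fleshed out, with an (assumed) dedicated locker object and the fields made accessible for demonstration, this test-and-set might look like:

```csharp
class AtomicCheckAndSet {
  public static bool endIsNigh, repented;
  static readonly object locker = new object();

  public static void RepentIfEndIsNigh() {
    // The test and the assignment execute inside one lock,
    // so no other thread can change either field in between.
    lock (locker) {
      if (endIsNigh)
        repented = true;
    }
  }
}
```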
A lock may also be preferable where a field is used many
times in a loop (assuming the lock is held for the duration of the loop).
While a single volatile read/write beats a single lock in performance, it's
unlikely that a thousand volatile read/write operations would beat one lock!
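For instance, rather than paying for a volatile access on every iteration, one might take the lock once around the whole loop. A sketch (the class, field, and method names are illustrative):

```csharp
class LoopUnderLock {
  static int total;
  static readonly object locker = new object();

  public static int AddRange (int count) {
    // One lock acquisition covers the whole loop - far cheaper
    // than a volatile (or locked) access on every iteration.
    lock (locker) {
      for (int i = 1; i <= count; i++)
        total += i;
      return total;
    }
  }
}
```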
Volatility is relevant only to primitive integral (and
unsafe pointer) types – other types are not cached in CPU registers and
cannot be declared with the volatile keyword. Volatile read and write
semantics are applied automatically when fields are accessed via the Interlocked
class.
If one has a policy of always accessing fields shared
by multiple threads within a lock statement, then volatile and
Interlocked are unnecessary.
(Collected)