Hardware-Based Solutions for Process Synchronization
When it comes to keeping processes in sync, you can either handle it through software algorithms or let the hardware do the heavy lifting. This lesson is about the hardware side of things.
The core idea behind hardware-based synchronization is atomic instructions. These are special machine-level operations that, once started, cannot be interrupted or paused midway. No other process or core can sneak in while one of these instructions is running. That guarantee of indivisibility is what makes them useful for synchronization in the first place.
Test and Set
Test and Set is about as simple as it gets for hardware synchronization. It reads the current value of a memory location and immediately sets it to 1, all in one uninterruptible step. Whatever the original value was gets returned so the caller knows whether the lock was free (0) or already taken (1).
Here is the assembly-level representation:
TEST_AND_SET:
LOAD R1, M; // Load the value of memory location M into register R1
MOV R2, #1; // Set R2 to 1 (indicating lock is acquired)
STORE R2, M; // Store the value of R2 back to memory location M
RETURN R1; // Return the original value of MAnd this is the equivalent logic expressed in C++:
bool test_and_set(bool *lock) {
bool old = *lock; // Read the current value
*lock = true; // Set it to true (locked)
return old; // Return the old value
}If the returned value is false, the lock was free and the caller has now acquired it. If the returned value is true, someone else already holds the lock and the caller must try again.
The critical detail is that reading the old value and writing the new value happen as a single indivisible operation at the hardware level. There is no window between the read and the write where another core could interfere.
Compare and Swap
Compare and Swap (CAS) takes a slightly more cautious approach than Test and Set. Instead of blindly setting a value, it first checks whether the current value matches what you expect. If it does, it swaps in the new value. If it does not, it leaves things as they are and just returns the current value so the caller knows something changed.
Here is the assembly:
COMPARE_AND_SWAP:
LOAD R1, M; // Load the value of memory location M into R1
CMP R1, R2; // Compare R1 with the expected value in R2
BEQ SWAP; // If equal, branch to SWAP
RETURN R1; // Otherwise, return the original value of M
SWAP:
STORE R3, M; // Store the new value in R3 to memory location M
RETURN R1; // Return the original value of MAnd the C++ version:
int compare_and_swap(int *ptr, int expected, int new_value) {
int old = *ptr;
if (old == expected) {
*ptr = new_value; // Swap only if current == expected
}
return old; // Always return the original value
}CAS is the foundation of most modern lock-free data structures. Languages like Java expose it directly through the AtomicInteger class, and C++ provides it via std::atomic. It is considered more flexible than Test and Set because it can be used for more than just locks.
Test and Set vs Compare and Swap
| Feature | Test and Set | Compare and Swap |
|---|---|---|
| Operation | Reads the old value and unconditionally sets it to 1 | Reads the old value and conditionally swaps only if it matches the expected value |
| Parameters | Takes 1 parameter (the lock address) | Takes 3 parameters (address, expected value, new value) |
| Flexibility | Limited to binary lock/unlock scenarios | Can implement counters, queues, and lock-free algorithms |
| Modern Usage | Mostly used in simple spinlocks | Foundation of lock-free programming in Java, C++, and Go |
Spinlocks
Spinlocks are built on top of these atomic instructions. The way they work is straightforward: if a thread tries to acquire a lock that is already taken, it just keeps looping and checking until the lock becomes free. It does not sleep or yield. It just sits there spinning.
Here is a basic spinlock implementation using Test and Set:
class Spinlock {
volatile bool lock = false;
public:
void acquire() {
while (test_and_set(&lock)) {
// Busy wait: keep spinning until lock is free
}
}
void release() {
lock = false;
}
};The acquire() method keeps calling test_and_set() in a loop until it gets back a false, which means the lock was free and has now been taken. The release() method simply sets the lock back to false so the next thread in line can grab it.
The Underlying Principle
All three of these mechanisms, Test and Set, Compare and Swap, and Spinlocks, rely on the same underlying principle: using hardware guarantees to make certain operations indivisible. That indivisibility is what allows processes running on different cores to coordinate without corrupting shared data.
Without these hardware primitives, every software-based synchronization algorithm (Peterson's, Dekker's, Bakery) would need to make much stronger assumptions about memory ordering and instruction interleaving, making them fragile and often impractical on modern multi-core processors.
Hardware Sync Check
Question 1 of 1Test your understanding of atomic hardware instructions.
