This final section deals with an issue that comes up in concurrent programming: race conditions, which happen when multiple threads want to modify a common resource. Consider the following function:

void incr( int* src )
{
	(*src)++;
}

Suppose we have an int named value that is initialized to 3. If we call this function twice on that value, we'd expect it to be 5 at the end.

However, suppose instead that we create two threads, each executing this function. Notably, when we create threads that execute concurrently, there's no guarantee of the relative order in which they will execute. In our case, our function might be decomposed into the following "instructions" (things to do):

1) Get value of (*src)
2) Increment value by 1
3) Store value back to (*src)

(If you're curious about the actual instructions, you can use a compiler explorer here)

If we have two threads executing concurrently, these instructions will be interleaved in some order. Let's consider the following order:

Thread 0: 1) Get value of (*src) - gets 3
Thread 0: 2) Increment value by 1 - now 4
Thread 0: 3) Store value back to (*src)
Thread 1: 1) Get value of (*src) - gets 4
Thread 1: 2) Increment value by 1 - now 5
Thread 1: 3) Store value back to (*src)

This interleaving would be the same as if we executed them "regularly", without threads, so we will get the correct answer. However, consider the following interleaving instead:

Thread 0: 1) Get value of (*src) - gets 3
Thread 1: 1) Get value of (*src) - gets 3
Thread 0: 2) Increment value by 1 - now 4
Thread 1: 2) Increment value by 1 - now 4
Thread 0: 3) Store value back to (*src)
Thread 1: 3) Store value back to (*src)

Here, both threads get the value 3 from (*src). They both increment it to 4, and both store 4 back to (*src). Because of this, value ends up with a final value of 4, not 5! This is known as a race condition because the outcome depends on how the different threads "race" through their functions. Since we have no guarantee of the order/interleaving of these instructions, we can't guarantee the final value of value! How can we deal with this scenario?

Atomic Operations

To solve this, C++ provides support for atomic operations. Atomic operations are guaranteed to execute in their entirety without being interrupted. This requires hardware support from our processor/OS, but because of the prevalence of concurrent programming, most modern computers support it. To use this in our code, C++ defines the std::atomic class in the <atomic> library file. This class has one member field, a value of some type (the class uses static polymorphism, so it can wrap a variety of types, although int is most common), and can be constructed with an initial value:

#include <atomic>

std::atomic<int> value( 3 );

To support atomic operations on the value, std::atomic overloads a lot of operators!

class atomic<int>
{
	...

	int operator++();        // Prefix increment
 	int operator++( int );   // Postfix increment
	int operator--();        // Prefix decrement
 	int operator--( int );   // Postfix decrement

	int operator+=( int v ); // Adds to value
	int operator-=( int v ); // Subtracts from value
	int operator&=( int v ); // Bitwise AND with value
	int operator|=( int v ); // Bitwise OR with value
	int operator^=( int v ); // Bitwise XOR with value

	...
};

In addition, std::atomic also defines some fetch member functions, which will become useful later. These perform an operation on the value, but also return the value that was held before the operation (all atomically!):

class atomic<int>
{
	...

	int fetch_add( int v ); // Adds argument to value, returns previous value
	int fetch_sub( int v ); // Subtracts argument from value, returns previous value
	int fetch_and( int v ); // Bitwise ANDs argument with value, returns previous value
	int fetch_or ( int v ); // Bitwise ORs argument with value, returns previous value
	int fetch_xor( int v ); // Bitwise XORs argument with value, returns previous value

	...
};

All of these operations are guaranteed to execute atomically. We can see how this solves our issue - we can redefine our incr function from above as

void incr( std::atomic<int>* src )
{
	(*src)++;
}

The increment inside is now guaranteed to execute atomically on the value that std::atomic<int> contains, so if we run this on two threads, we are guaranteed to get the correct value. You could think of this as combining the three steps from above into one single step - no matter the ordering, since they are all one step, we are guaranteed to get the same value at the end. In the diagram below, the blue boxes represent how the steps are combined into one with the atomic operations - no matter the order we execute the blue boxes, the result will be the same:

Locks

Oftentimes, we will want to execute multiple instructions atomically (for example, multiplying matrices). While an atomic operation on a single integer is nice, it can't help us in this scenario - we need to do a much larger operation! Instead, we introduce the concept of locks. Locks are used with critical sections - sections of code that we want to make sure execute atomically. We acquire the lock going into the section, and release it once we are done with the critical section. If our code can't acquire the lock, it must wait until it can before entering the critical section. In this way, if two functions use the same lock, we can guarantee that only one of them will ever be in the critical section at a time. Because of this, we say that the lock guards the critical section - it makes sure that only one function is in the critical section at a time.

void incr( int* src )
{
	< acquire lock >

	(*src)++; // Critical section

	< release lock >
}

Note: This DOESN'T guarantee that other functions won't execute while we're in the critical section, as only code that needs the lock will be prevented from running! However, we can define our critical sections such that running other (non-critical) code won't change the outcome. For instance, if we're multiplying matrices, there might be some setup - we can say that the setup can happen at any time and won't interfere with other functions, but the actual multiplication needs to be atomic. This allows, say, Thread 1 to be doing multiplication while still allowing Thread 2 to complete its setup, instead of cutting Thread 2 off completely, letting us still get as much benefit as possible from parallelizing. Likewise, if we have critical sections that use different resources, we might have them use different locks, so that they guard against other code trying to use the same resource, but not against code using a different resource.

Implementing Locks

While this concept of locks is nice, one thing we have to ensure in our implementation is that the process of acquiring a lock is itself atomic (or else we could have multiple threads in the process of acquiring the lock at once, which could lead to both being in the critical section at the same time - exactly what we want to avoid!). Releasing locks is still important to consider, but less so - at that point, only one function holds the lock, so only one will be attempting to release it, and it's OK if another piece of code acquires the lock before we're completely done releasing it, since if we're releasing, we're done with the critical section.

Since we want it to be atomic, let's use our int wrapper in std::atomic from before! We can define a lock as simply being one of these std::atomic<int>s - let's say that the lock is acquired if it holds the value 1, and is available if the value is 0:

std::atomic<int> lock( 0 );

Releasing the lock is fairly simple - we can just set the lock to be zero:

void release( std::atomic<int>* lock )
{
	(*lock) = 0;
}

However, as expected, acquiring a lock becomes a bit of an issue. From a high level, we want to continuously check if the lock is acquired, and when it becomes available, acquire it by setting the value to 1. This approach might look something like below (where load gets the value stored):

void acquire( std::atomic<int>* lock )
{
	while( lock->load() == 1 )
	{ } // Wait until it's 0

	// Set to 1
	(*lock)++;
}

However, this could bring its own issues with not executing atomically! Consider the following interleaving of two threads in the acquire function, where the lock is initially free:

Thread 0: lock->load() returns 0 - exits the while loop
Thread 1: lock->load() returns 0 - exits the while loop
Thread 0: increments the lock
Thread 1: increments the lock

Both threads see the lock as free, and therefore pass through the while loop. However, both then increment it, and both believe that they've acquired the lock, allowing both into the critical section! This is clearly not going to work - we need some way to set the value as we're accessing it, with both operations executing atomically together.

This is exactly where we can introduce those other functions defined in std::atomic to solve our issue - they can get the previous value stored and edit that value atomically! Consider the revised acquire function below:

void acquire( std::atomic<int>* lock )
{
	while( lock->fetch_or(1) == 1 )
	{ } // Wait until it's 0
}

Here, the fetch_or not only gets the current value, but also ORs it with 1 (essentially setting it to 1). If the lock was already acquired (the previous value was 1), the function keeps trying, and the fetch_or has no effect. However, if the lock was free, the fetch_or not only returns 0 (allowing the function to proceed past the loop), but also sets the value to 1 at the same time (atomically), preventing any other piece of code from acquiring the lock. With this, we can use these functions to correctly implement what's known as mutual exclusion - where both functions exclude each other from doing something (in this case, operating within the critical section):

#include <atomic>
#include <thread>

void incr( int* src, std::atomic<int>* lock )
{
 	while( lock->fetch_or(1) == 1 )
	{ } // Wait until it's 0

	// Lock acquired

	(*src)++; // Critical section

 	(*lock) = 0; // Release lock
}


int main( void )
{
	std::atomic<int> lock( 0 );
	int value = 3;

	std::thread thread0( &incr, &value, &lock );
	std::thread thread1( &incr, &value, &lock );

	thread0.join();
	thread1.join();

	// value guaranteed to hold 5
}

Mutexes

Finally, remembering to implement all of these acquire/release functions can be a pain sometimes. With C++'s focus on objects, however, we can neatly encapsulate this in an object to do everything for us, called a mutex (short for mutual exclusion). This object can provide a clean wrapper around the std::atomic<int>, and have lock and unlock member functions to implement acquiring and releasing the lock, respectively:

class mutex
{
  public:
	mutex() : flag( 0 ) { }

	void lock()
	{
		while( flag.fetch_or( 1 ) == 1 )
		{ }
	}

	void unlock() { flag = 0; }

  private:
	std::atomic<int> flag; // Named so it doesn't clash with lock()
};

With this, we can abstract away the locking and unlocking process, as well as the implementation, leading to much cleaner code:

#include <mutex>
#include <thread>

std::mutex incr_mutex;

void incr( int* src )
{
 	incr_mutex.lock();

	(*src)++; // Critical section

 	incr_mutex.unlock();
}


int main( void )
{
	int value = 3;

	std::thread thread0( &incr, &value );
	std::thread thread1( &incr, &value );

	thread0.join();
	thread1.join();

	// value guaranteed to hold 5
}

Since this is so often used in concurrent programming, C++ actually provides this for us in the Standard Library (as std::mutex, in the <mutex> library file).

Great - you now know all you need to implement concurrency across any type of program, whether resources are shared or not! Let's get one last bit of practice with this.

Practice

If you haven't already, clone the practice repository by opening a terminal in your desired directory and typing

git clone git@github.com:cornell-c2s2/c_cxx_training.git

Once inside this directory, we'll switch over to the code for this chapter by checking out the appropriate branch:

git checkout chpt18

Due to your recent success over your coworker, your company is now considering you for a promotion! However, to prove that you're worth it, you must demonstrate your skills one final time. Your company's finance department is having some difficulties with their budget tracking software. They've implemented a spend_money function (in the src folder, defined in spend_money.h, implemented in spend_money.cc) which their associates use to change their budget when a purchase is made. These associates are working simultaneously around the world, so your department has simulated these calls in a testing script named budget_testing.cc (in the app folder). You can look at this code to see what it's doing, and build it right away by opening a terminal in the main practice directory and typing

make budget_testing

Our built file should now be in the build directory. We can now run our code with

build/budget_testing # MacOS/Linux
build\budget_testing # Windows

You'll quickly see that the outputs and the end result are all gibberish! The finance department has no idea what's going on, but you as a programmer have a hunch that, when concurrency was introduced, there may be a race condition! You're tasked with fixing this, but the finance department wants to keep as much existing code as possible. Because of this, you are not allowed to edit budget_testing.cc or spend_money.h, and can not insert code into the code already written for the spend_money function in spend_money.cc. Can you operate within these restrictions to get rid of the race condition?

Once you've done so, you can re-build and re-run your code. You should get nicely structured output and a final budget of $575,000. Your finance department is super impressed, and you get the promotion - excellent job!!

Final Notes

This concludes the C/C++ Training! You should hopefully now feel comfortable working within C and C++, and should understand all of the basic concepts, as well as some advanced functionality, including generic programming with dynamic and static polymorphism, functional programming with functors and lambdas, and concurrent programming with threads and atomic operations. Throughout this training, you have understood thousands of lines of code, and have certainly written a fair portion yourself.

While I aimed to cover all of the core functionality of the languages, there are some features I chose to omit for concision. However, you have all of the tools you need to understand any challenge you may face. C++ has many external libraries that you can install, with their own header files, that provide advanced functionality. These may use templates, pointers/references, and function objects, but you have all the knowledge you need to use them.

If you have any questions, feel free to reach out to me (Aidan Connor McNay) - I'm always happy to answer questions! I hope you've enjoyed learning; it was certainly a joy being able to teach you :)
