Understanding C++11 memory fences

Your usage of fences does not actually ensure the things you mention in your comments: it does not guarantee that your assignments to a are visible to other threads, or that the value you read from a is 'up to date'. Although you have the basic idea of where the fences should go, your code does not meet the precise requirements for those fences to synchronize with each other.

Here’s a different example that I think demonstrates correct usage better.

#include <iostream>
#include <atomic>
#include <thread>

std::atomic<bool> flag(false);
int a;

void func1()
{
    a = 100;                                              // non-atomic write
    std::atomic_thread_fence(std::memory_order_release);  // release fence
    flag.store(true, std::memory_order_relaxed);          // relaxed atomic store
}

void func2()
{
    while (!flag.load(std::memory_order_relaxed))         // spin until we observe the store
        ;

    std::atomic_thread_fence(std::memory_order_acquire);  // acquire fence
    std::cout << a << '\n';                               // guaranteed to print 100
}

int main()
{
    std::thread t1 (func1);
    std::thread t2 (func2);

    t1.join(); t2.join();
}

The load and store on the atomic flag do not synchronize with each other, because they both use relaxed memory ordering. Without the fences this code would contain a data race: we are performing conflicting operations on a non-atomic object (a) in different threads, and without the synchronization the fences provide there would be no happens-before relationship between those conflicting operations.

However, with the fences we do get synchronization: we've guaranteed that thread 2 reads the value of flag written by thread 1 (because we loop until we see that value), the atomic store is sequenced after the release fence, and the atomic load is sequenced before the acquire fence. Under those conditions the release fence synchronizes with the acquire fence (see § 29.8/2 for the exact requirements).

This synchronization means that everything sequenced before the release fence in thread 1 happens-before everything sequenced after the acquire fence in thread 2. In particular, the non-atomic write to a happens-before the non-atomic read of a, so there is no data race and the read must see the value 100.
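
For what it's worth, the same guarantee can be obtained without standalone fences by putting the ordering on the flag operations themselves: a release store synchronizes with an acquire load that reads its value. Here's a sketch of that variant (my own illustration, not a change to the code above):

#include <iostream>
#include <atomic>
#include <thread>

std::atomic<bool> flag(false);
int a;

void func1()
{
    a = 100;
    flag.store(true, std::memory_order_release);   // release store publishes the write to a
}

void func2()
{
    while (!flag.load(std::memory_order_acquire))  // acquire load that reads the release store
        ;
    std::cout << a << '\n';                        // also guaranteed to print 100
}

int main()
{
    std::thread t1(func1);
    std::thread t2(func2);
    t1.join(); t2.join();
}

The standalone fences in the earlier example are essentially doing the job that the release/acquire tags on the flag operations do here, while letting the flag accesses themselves stay relaxed.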

Things get trickier when you're writing a variable in a loop, because you might establish a happens-before relation for some particular iteration, but not for other iterations, which causes a data race.

std::atomic<int> f(0);
int a;

void func1()
{
    // Store 1..1000000 so the reader below has a definite final value to wait for.
    for (int i = 1; i <= 1000000; ++i) {
        a = i;
        std::atomic_thread_fence(std::memory_order_release);
        f.store(i, std::memory_order_relaxed);
    }
}

void func2()
{
    int prev_value = 0;
    while (prev_value < 1000000) {
        // Spin until f holds a value we haven't seen yet.
        while (true) {
            int new_value = f.load(std::memory_order_relaxed);
            if (prev_value < new_value) {
                prev_value = new_value;
                break;
            }
        }

        std::atomic_thread_fence(std::memory_order_acquire);
        std::cout << a << '\n';
    }
}

This code still makes the fences synchronize, but it does not eliminate data races. For example, if a particular f.load() happens to return 10, then we know the writes a = 1, a = 2, …, a = 10 all happen-before that particular cout << a, but we know nothing about the ordering between that cout << a and a = 11. Those are conflicting operations on different threads with no happens-before relation; that's a data race.
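
If you want to eliminate that remaining race while keeping a non-atomic, one option (just a sketch of mine, with illustrative names ready and consumed) is to make the writer wait for an acknowledgement before it overwrites a, so the read of one value always happens-before the write of the next:

#include <iostream>
#include <atomic>
#include <thread>

std::atomic<int> ready(0);      // writer -> reader: "a now holds value i"
std::atomic<int> consumed(0);   // reader -> writer: "I have finished reading value i"
int a;

void writer()
{
    for (int i = 1; i <= 100; ++i) {
        a = i;
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(i, std::memory_order_relaxed);

        // Don't overwrite a until the reader acknowledges value i.
        while (consumed.load(std::memory_order_relaxed) != i)
            ;
        std::atomic_thread_fence(std::memory_order_acquire);
    }
}

void reader()
{
    for (int i = 1; i <= 100; ++i) {
        while (ready.load(std::memory_order_relaxed) != i)
            ;
        std::atomic_thread_fence(std::memory_order_acquire);

        std::cout << a << '\n';  // no write to a can be concurrent with this read

        std::atomic_thread_fence(std::memory_order_release);
        consumed.store(i, std::memory_order_relaxed);
    }
}

int main()
{
    std::thread t1(writer);
    std::thread t2(reader);
    t1.join(); t2.join();
}

Now the acquire fence at the end of the writer's loop synchronizes with the release fence in the reader, so each read of a happens-before the next write to a; the price is that the two threads run in lock-step.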
