The C++11 memory ordering parameters for atomic operations specify constraints on how other memory accesses may be ordered around the atomic operation. If you do a store with std::memory_order_release, and a load from another thread reads the stored value with std::memory_order_acquire, then subsequent read operations from the second thread will see any values stored to any memory location by the first thread prior to the store-release, or a later store to any of those memory locations.
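The release/acquire guarantee is easiest to see with a flag that publishes ordinary data. This is only a minimal sketch: the names payload, ready and run_message_passing are made up for illustration and are not part of any library.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: one thread publishes plain data through a
// store-release, the other picks it up after a load-acquire.
int run_message_passing() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // flag used to publish payload
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // write prior to the store-release
        ready.store(true, std::memory_order_release);  // publish
    });
    std::thread consumer([&] {
        // Spin until the load-acquire reads the value written by the store-release.
        while (!ready.load(std::memory_order_acquire))
            std::this_thread::yield();
        seen = payload;  // subsequent read: must see the write to payload
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

Calling run_message_passing() should always return 42. With std::memory_order_relaxed on both sides, the read of payload would be a data race instead.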
If both the store and subsequent load are
std::memory_order_seq_cst then the relationship between these two threads is the same. You need more threads to see the difference.
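For example, consider two std::atomic<int> variables x and y, both initially 0.
Thread 1:
x.store(1,std::memory_order_release);
Thread 2:
y.store(1,std::memory_order_release);
Thread 3:
int a=x.load(std::memory_order_acquire); // x before y
int b=y.load(std::memory_order_acquire);
Thread 4:
int c=y.load(std::memory_order_acquire); // y before x
int d=x.load(std::memory_order_acquire);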
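As written, there is no relationship between the stores to x and y, so it is quite possible to see a==1 and b==0 in thread 3, and c==1 and d==0 in thread 4. In other words, the two reading threads can disagree about which store happened first.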
If all the memory orderings are changed to std::memory_order_seq_cst then this enforces a single total order on the stores to x and y, which every thread agrees on. Consequently, if thread 3 sees a==1 and b==0 then the store to x must be before the store to y in that order, so if thread 4 sees c==1, meaning the store to y has completed, then the store to x must also have completed, so we must have d==1.
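The four-thread case with std::memory_order_seq_cst can be sketched as a small test harness. The names Observed, run_iriw_seq_cst, is_forbidden and check_iriw_seq_cst are hypothetical, chosen just for this illustration. A passing run does not prove anything about the memory model; it only shows that the outcome ruled out by sequential consistency is never observed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values read by the two reader threads in one run.
struct Observed { int a, b, c, d; };

// Run the four threads once with seq_cst on every operation.
Observed run_iriw_seq_cst() {
    std::atomic<int> x{0}, y{0};
    Observed r{0, 0, 0, 0};

    std::thread t1([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread t2([&] { y.store(1, std::memory_order_seq_cst); });
    std::thread t3([&] {                       // x before y
        r.a = x.load(std::memory_order_seq_cst);
        r.b = y.load(std::memory_order_seq_cst);
    });
    std::thread t4([&] {                       // y before x
        r.c = y.load(std::memory_order_seq_cst);
        r.d = x.load(std::memory_order_seq_cst);
    });

    t1.join(); t2.join(); t3.join(); t4.join();
    return r;
}

// The outcome in which the readers disagree about the store order.
bool is_forbidden(const Observed& r) {
    return r.a == 1 && r.b == 0 && r.c == 1 && r.d == 0;
}

// Repeat the experiment; the forbidden outcome must never appear.
void check_iriw_seq_cst(int runs) {
    for (int i = 0; i < runs; ++i)
        assert(!is_forbidden(run_iriw_seq_cst()));
}
```

Swapping every ordering for release on the stores and acquire on the loads makes the "forbidden" outcome permitted by the standard, although many machines (x86 in particular) will still never show it.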
In practice, using std::memory_order_seq_cst everywhere will add overhead to either loads or stores or both, depending on your compiler and processor architecture. For example, a common technique for x86 processors is to use XCHG instructions rather than MOV instructions for std::memory_order_seq_cst stores, in order to provide the necessary ordering guarantees, whereas for std::memory_order_release a plain MOV will suffice. On systems with more relaxed memory architectures the overhead may be greater, since plain loads and stores have fewer guarantees.
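If you want to look at the cost yourself, a pair of tiny functions like the following can be compiled with optimizations and their assembly compared. This is only a sketch: flag, store_release, store_seq_cst and exercise_stores are made-up names, and the instructions you actually get depend on your compiler, its version and the target.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> flag{0};  // hypothetical shared variable

// On x86 this is typically a plain MOV.
void store_release(int v) {
    flag.store(v, std::memory_order_release);
}

// On x86 this is commonly an XCHG (or a MOV followed by a fence).
void store_seq_cst(int v) {
    flag.store(v, std::memory_order_seq_cst);
}

// Single-threaded sanity check: both functions store the value;
// they differ only in the ordering guarantees they provide.
void exercise_stores() {
    store_release(1);
    assert(flag.load() == 1);
    store_seq_cst(2);
    assert(flag.load() == 2);
}
```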
Memory ordering is hard. I devoted almost an entire chapter to it in my book.