Memory Barriers: a Hardware View for Software Hackers

On in Bookmark by Mingxing Zhang

URL: http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.07.23a.pdf

This paper clearly explains the techniques CPUs use to get more out of their caches, and justifies memory barriers as a necessary evil required for good performance and scalability.

Its general structure is as follows:

  1. Presents the structure of a cache;
  2. Explains how cache-coherency protocols ensure that different per-CPU caches coordinate with each other;
  3. Describes a technique called the "store buffer", which reduces the stalls caused by waiting for invalidate-acknowledgement messages;
  4. Gives an example of why write memory barriers are needed: store buffers let a CPU's stores become visible to other CPUs out of program order, which improves performance, so we need a way to guarantee that certain critical orderings are preserved;
  5. Outlines another technique, the "invalidate queue", which lets invalidate-acknowledgement messages arrive more quickly;
  6. Gives a corresponding example of why read memory barriers are needed: invalidate queues cause another kind of reordering, which read memory barriers prevent.
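
To make items 4 and 6 concrete, here is a minimal sketch of the flag-and-payload pattern that both kinds of barrier protect. It is an illustration under assumptions, not code taken from the paper: the names `a`, `b`, `foo`, `bar` and `run_message_passing` are chosen for this example, and C11 release/acquire fences stand in for the write and read memory barriers (they are somewhat stronger than pure write-only or read-only barriers).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int a;          /* payload: an ordinary variable */
static atomic_int b;   /* flag: signals that the payload is ready */

/* Writer: publish the payload, then raise the flag. */
static void *foo(void *arg)
{
    (void)arg;
    a = 1;
    /* Plays the role of a write barrier: the store to a must be
     * visible before the store to b. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&b, 1, memory_order_relaxed);
    return NULL;
}

/* Reader: wait for the flag, then read the payload. */
static void *bar(void *arg)
{
    int *seen = arg;
    while (atomic_load_explicit(&b, memory_order_relaxed) == 0)
        continue;
    /* Plays the role of a read barrier: the load of a must not be
     * satisfied by a stale value from before b was observed as set. */
    atomic_thread_fence(memory_order_acquire);
    *seen = a;
    return NULL;
}

/* Runs one writer and one reader; returns the value of a the reader saw. */
int run_message_passing(void)
{
    pthread_t writer, reader;
    int seen = -1;

    a = 0;
    atomic_store(&b, 0);
    pthread_create(&reader, NULL, bar, &seen);
    pthread_create(&writer, NULL, foo, NULL);
    pthread_join(writer, NULL);
    pthread_join(reader, NULL);
    return seen;
}
```

With both fences in place, `run_message_passing()` returns 1. Without the fence in `foo`, the store to `b` could become visible before the store to `a`. Without the fence in `bar`, the reader could still observe a stale `a` after seeing `b` set. Either omission also makes the access to `a` a data race under the C11 memory model.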

The paper also includes many quizzes and discusses how real architectures (e.g. ARM, IA64) implement memory barriers.