Anticipating Invariant

By Mingxing Zhang

AI in Brief

Concurrency bugs (CBugs) are notoriously difficult to eradicate during testing because of their non-deterministic nature, and fixing them is also time-consuming and error-prone.

Tolerating concurrency bugs in production has therefore emerged as an attractive complementary approach. Unfortunately, existing tolerance tools usually either 1) handle only limited types of bugs, or 2) require a roll-back mechanism, which so far cannot be implemented efficiently without hardware support.

In contrast, the Anticipating Invariant (AI) can anticipate CBugs before any irreversible changes have been made. Based on it, we implemented a software-only tool that tolerates concurrency bugs on the fly.

The tool restricts the program's interleaving space so that AI-violating (i.e., potentially failure-triggering) interleavings are avoided during production runs. Because AI detects bugs beforehand, the tool can bypass suspicious interleavings by stalling threads instead of resorting to roll-back.
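The idea of stalling rather than rolling back can be illustrated with a toy sketch. The Python code below is not the tool's LLVM-based implementation. The `StallGuard` class, the site label `"init"`, and the timeout are hypothetical names chosen for illustration, and the invariant is deliberately simplified: a read may proceed only if the most recent write came from an allowed site. The point is only that the check happens before the read executes, so a suspicious order can be avoided by waiting.

```python
import threading
import time


class StallGuard:
    """Toy guard (hypothetical, for illustration only).

    Simplified invariant: a read may proceed only when the most recent
    write came from one of the allowed write sites.
    """

    def __init__(self, allowed_writers, timeout=5.0):
        self._cond = threading.Condition()
        self._allowed = set(allowed_writers)
        self._last_writer = None   # label of the most recent write site
        self._timeout = timeout    # bound on stalling (an assumption)
        self.stalls = 0            # how many reads had to wait

    def write(self, site, store, key, value):
        with self._cond:
            store[key] = value
            self._last_writer = site
            self._cond.notify_all()   # wake any stalled reader

    def read(self, store, key):
        with self._cond:
            # Check BEFORE the access: nothing irreversible has happened yet,
            # so the thread can simply wait instead of being rolled back.
            if self._last_writer not in self._allowed:
                self.stalls += 1
                self._cond.wait_for(
                    lambda: self._last_writer in self._allowed,
                    timeout=self._timeout,
                )
            return store.get(key)


def run_demo():
    """Force an order violation: the consumer reads before initialization."""
    store = {}
    guard = StallGuard(allowed_writers={"init"})
    result = []

    consumer = threading.Thread(
        target=lambda: result.append(guard.read(store, "buffer"))
    )
    initializer = threading.Thread(
        target=lambda: guard.write("init", store, "buffer", "ready")
    )

    consumer.start()
    while guard.stalls == 0:      # wait until the consumer is stalled
        time.sleep(0.001)
    initializer.start()

    consumer.join()
    initializer.join()
    return result[0], guard.stalls


if __name__ == "__main__":
    print(run_demo())
```

Without the guard, the consumer in this demo would read an uninitialized `buffer`. With it, the consumer waits until the initializing write has happened and then reads the initialized value. The timeout is an assumption of this sketch, added so that a stalled thread cannot hang forever if the expected write never arrives.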

Experiments with 35 real-world concurrency bugs demonstrate that AI is capable of detecting and tolerating most types of concurrency bugs, including both atomicity and order violations.

We also evaluate AI with 6 representative parallel programs. The results show that AI incurs negligible overhead (< 1%) for many nontrivial desktop and server applications, and that its slowdown on computation-intensive programs can be reduced to about 2x by using the bias instrumentation.

To the best of our knowledge, this is the first attempt to efficiently tolerate previously unknown order and atomicity violations at run time without using rollback.

Paper

Won the SIGSOFT Distinguished Paper Award.

Software

You can download and try AI here.

The package includes:

  1. the source code of our LLVM-based AI implementation;
  2. several demos of AI's ability to tolerate CBugs;
  3. applications from different categories (desktop, server, HPC) for evaluating AI's overhead;
  4. an example of how to use the APIs.

Documentation, screencasts, and some auxiliary scripts are also provided.