|
Current multiprocessor systems execute parallel and concurrent
software nondeterministically: even when given precisely the same
input, two executions of the same program may produce different
output. This severely complicates debugging, testing, and automatic
replication for fault tolerance. Previous efforts to address this
issue have focused primarily on record and replay, but making
execution actually deterministic would address the problem at the
root.
Our goals in this work are twofold: (1) to provide fully
deterministic execution of arbitrary, unmodified, multithreaded
programs as an OS service; and (2) to make all sources of
intentional nondeterminism, such as network I/O, explicit and
controllable. To this end we propose a new OS abstraction, the
Deterministic Process Group (DPG). All communication between
threads and processes internal to a DPG happens
deterministically, including implicit communication via
shared-memory accesses, as well as communication via OS channels
such as pipes, signals, and the filesystem. To deal with
fundamentally nondeterministic external events, our
abstraction includes the shim layer, a programmable interface
that interposes on all interaction between a DPG and the external
world, making determinism useful even for reactive applications.
We implemented the DPG abstraction as an extension to Linux and
demonstrated its benefits with three uses: plain deterministic
execution; replicated execution; and record and replay by logging
just external input. We evaluated our implementation on both parallel
and reactive workloads, including Apache, Chromium, and PARSEC.
|