Network Semantics for Verifying Distributed Systems

Distributed systems are hard to implement correctly. Because of the interactions between many potentially failing components, it is essentially impossible to perform exhaustive testing on any real-world implementation. Most known techniques for checking concurrent and distributed algorithms work only on models of the system, which results in a formality gap between the model and its implementation. For example, classical model checking has been shown to be extremely effective in finding concurrency and fault-tolerance bugs at the design level, but it does not scale to the complex data structures used in real systems. Formal verification with human-guided, semi-automated, machine-checked proofs is the only technique that can provide the desired guarantees for such systems.

Verdi is a Coq-based framework for implementing and verifying distributed systems. At the core of Verdi are a set of network semantics. Each network semantics encodes a set of assumptions about what kinds of things can happen to a distributed system. Network semantics can range from the idealistic to the pessimistic. For example, one might assume that all messages are eventually delivered and that nodes never fail. On the other hand, one might assume that packets can be dropped and duplicated and that some nodes behave arbitrarily or maliciously. Different systems are designed under different sets of assumptions, and network semantics capture those assumptions.¹

The main point of defining a network semantics in Verdi is to verify distributed systems against that semantics. After verifying a safety property \(\Phi\), Verdi enables programmers to extract executable code and run their systems on real networks.² Assuming the network semantics correctly describes all possible behaviors of the system's environment, Verdi guarantees that \(\Phi\) holds on all executions of the system.³

The rest of this post describes Verdi's network semantics by developing a slightly simplified version of them in Coq. For the real deal, see the file Net.v on github.

Assumptions common to all network semantics

Verdi systems consist of a set of nodes operating in an asynchronous network environment. Each node keeps some local state and can exchange messages with other nodes. Messages can be arbitrarily delayed and reordered.⁴ Nodes also interact with the external environment through input and output. Verdi systems are written as a set of event handlers. For example, when a message is delivered to a node, that node's message delivery event handler is run. An event handler can read and write the local state of the node, send messages to other nodes, and send output to the external environment. A special handler is run at the beginning of time to initialize the each node's state.

Verdi's network semantics are defined by describing the possible "steps" that a system can take during execution. For example, one step might be that a message in the network can be delivered to its destination and the message delivery event handler run. The possible steps are defined formally as a relation on "worlds". A world is snapshot of a system's execution, and consists of the local state of every node of the system, the set of packets that are in flight, and the trace of external events.

Here we declare five variables that describe the system and its interaction with the environment. First, node describes the set of nodes (or node names, if you prefer) in the system. For example, this might be some set consisting of high level names such as "Database" and "Client". Second, state is the type of the local state that each node maintains. For example, in an in-memory key-value store, this might consist of some sort of map data structure.⁵ Third, msg describes the messages sent between nodes. For example, in a key-value store this might consist of read and write requests intended to be sent from the client to the database, as well as responses and acknowledgments, intended to be sent in the other direction. Fourth, input describes the events that nodes respond to from the external environment. For example, input can be used to describe requests that come from clients whose code we did not write or do not want to verify. Finally, output describes the events that nodes send to the external environment. For example, output can be used to respond to requests that were made via input.

An externalEvent is either an input or an output, together with the node at which the event occurs. A packet contains a message and a destination.⁶ The world consists of a function from nodes to their local state and a list of in flight messages together with their destination. Although we use a list of packets here, we are careful to treat it as a multiset; that is, the order of its elements does not matter. Future versions of Verdi may use a more abstract multiset type here instead. Finally, the world also contains trace, which is a list of externalEvents. The trace records the history of all external events that any node has ever sent or received; it allows us to reason about the external behavior of the system over time.

When a node starts, the initState computes its initial state. When a message is delivered, the processMsg handler is run. Its first argument is the node on which the handler is running (that is, the destination of the message). Its second argument is the message that was delivered. The third argument is the current value of the node's local state. The handler returns the new value of the local state, a list of packets to send, and a list of outputs to produce. The processInput handler is very similar, except it takes an input instead of a msg.⁷

The reliable semantics

It isn't too hard to hook up all these parts into a semantics that delivers messages from the network and allows arbitrary input from the external environment. We call this semantics the reliable semantics because no messages are dropped or duplicated, and no nodes ever fail.

The reliable semantics is implemented as the relation reliable_step, which is defined in terms of two rules. The first rule delivers an arbitrary input i to an arbitrary node n by running its processInput handler, passing in i and the current local state. The results of processInput are used to build a new world where the local state is updated, new messages are added to the in flight list, and the trace is updated to record the input and any outputs.

The second rule delivers a packet p from the network by running the processMsg at dest p with the p's payload message and the current local state as arguments. The results are used to update the world, similar to before, except that one copy of p is also removed from the network.

Other network semantics

Small-step operational semantics, like we've just seen in reliable_step, are a powerful and flexible way of modeling the behavior of systems. As an exercise, the reader (or at least, the reader who hasn't read our paper...) should try to design rules for dropping and duplicating packets, being careful not to depend on the order of the list of packets. Verdi also includes semantics modeling node failure, but it requires changing the world type to keep track of the nodes that have failed. For more details, see the file Net.v in the Verdi implementation.

What's next

In our next post, we'll show how to actually build and verify a system in Verdi using the reliable network semantics. Future posts will also discuss Verdi's verified systems transformers, which provably hide faults, so that application programmers don't have to reason about them.

The literature often uses the term "fault model" to mean the set of assumptions that a system makes about its environment. We use the term network semantics because our fault models are written down formally as operational semantics. We consider "fault model" to be a more general term, encompassing all levels of formality.↩
Verdi currently focuses on safety properties, and we leave liveness properties to future work.↩
This also assumes the correctness of Verdi's trusted computing base (TCB), which includes: the soundness of Coq's logic, the correctness of Coq's proof checker, the correctness of Verdi's shim, and the correctness of OCaml's compiler and runtime, etc.↩
There is nothing preventing Verdi from including other network semantics that make different assumptions. For example, we recently added an ordered network semantics. One could also imagine a synchronous message passing model, where all nodes exchange messages in rounds. Future versions of Verdi may include such models. (Pull requests welcome!!!)↩
Verdi's implementation currently restricts every node to maintain the same type of state. It is possible to allow each node to maintain a different type of state. Future versions of Verdi will almost certainly support this.↩
...though this isn't ideal terminology, since there are none of the low-level network details that one might expect "packet" to connote.↩
If you are familiar with monads, the last argument and return type of the message and input handlers can be understood as a State monad over the local state, together with a Writer monad for sending packets. Verdi supports writing handlers in monadic style. See the file HandlerMonad.v on github.↩

Network Semantics for Verifying Distributed Systems

James Wilcox

April 16, 2015

Assumptions common to all network semantics

The reliable semantics

Other network semantics

What's next