The problem. Formal verification is a promising alternative to software testing, the current method of choice for software quality
assurance. Unlike testing---which does not scale because new test cases must be created as the code grows---verification can check the
same correctness property (e.g., whether each acquired lock is eventually released) against the entire codebase. Even more importantly,
verification examines all execution paths of the program, providing a guarantee that the program is entirely free of the
specified type of error.
Recent research results promise that practical verifiers will soon be available. These tools, however, will be of little use without enough
correctness specifications to be verified. Where these specifications will come from is an open question: While it may be economical to
formulate broadly applicable language-specific properties (e.g., a dereferenced pointer must not be null), complex properties that are
highly specific to a single program (e.g., how to rotate a red-black tree) may never be formalized, partly due to programmers'
reluctance to embrace formal methods, and partly because there are too many properties to be specified. Consequently, we expect that
software verification will be widely adopted only if we develop techniques that will reduce the cost of formulating specifications.
Our approach. To supply future verifiers with useful non-trivial properties, we are developing a (semi-)automatic machine-learning
approach to discovering formal specifications of the protocols that code must obey when interacting with an application programming interface
(API) or abstract data type (ADT). Ours is the first system capable of learning rather general protocols such as ``A socket x may be read only after it was accept-ed from a socket y that had first been bind-ed and then listen-ed.'' In general, our specifications encode legal partial orders of API calls, together
with data-flow constraints on their arguments, encoded as (potentially cyclic) state machines.
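To make the encoding concrete, the socket protocol above can be sketched as a small state machine. This is a toy illustration only (the state names and the close transition are our assumptions, not part of the mined specification), and the data-flow constraints on the socket arguments are indicated only in comments:

```python
# Hypothetical sketch of the mined socket protocol as a state machine.
# States track protocol stages; transitions are API calls. The cycle on
# "read" makes the machine (potentially) cyclic, as noted in the text.

TRANSITIONS = {
    # (state,       call)        -> next state
    ("start",       "bind"):        "bound",      # bind(y)
    ("bound",       "listen"):      "listening",  # listen(y)
    ("listening",   "accept"):      "accepted",   # x = accept(y)
    ("accepted",    "read"):        "accepted",   # read(x); may repeat
    ("accepted",    "close"):       "closed",     # close(x) (assumed)
}

def check_trace(calls):
    """Return True iff a sequence of API calls obeys the protocol."""
    state = "start"
    for call in calls:
        key = (state, call)
        if key not in TRANSITIONS:
            return False  # violation, e.g. read before accept
        state = TRANSITIONS[key]
    return True

# check_trace(["bind", "listen", "accept", "read", "read"])  -> True
# check_trace(["bind", "read"])                              -> False
```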
Our miner follows a (probabilistic) maxim that ``Common behavior is correct behavior.'' Starting from the assumption that a working
program is debugged well enough to reveal strong hints of correct protocols, it infers a specification by observing dynamic program
execution and concisely summarizing the frequent API interaction patterns as state machines. After these state machines are examined
and refined by an expert, they can be used by automatic verification tools to find bugs.
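The summarization step can be illustrated with a minimal sketch (our own toy code, not the actual miner): a set of observed API traces is collapsed into a state machine whose edges record which call was seen to follow which.

```python
from collections import defaultdict

def summarize(traces):
    """Collapse API traces into a state machine: an edge a -> b records
    that call b was observed to follow call a in some trace."""
    edges = defaultdict(set)
    for trace in traces:
        prev = "<start>"
        for call in trace:
            edges[prev].add(call)
            prev = call
    return dict(edges)

traces = [
    ["bind", "listen", "accept", "read"],
    ["bind", "listen", "accept", "read", "read"],  # repeated read: a cycle
]
machine = summarize(traces)
# machine["accept"] == {"read"}; machine["read"] == {"read"} (the cycle)
```

A real miner works on dynamic executions and tracks argument data flow; this sketch keeps only the call-ordering skeleton.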
Specification mining offers several unique benefits.
Specification mining by reduction to learning regular languages. Specification mining is not
data mining: the two learn different properties and require different algorithms. Our
POPL'02 paper formulated the specification mining problem,
and developed an algorithm based on an interesting reduction to the
simpler problem of learning regular languages. To be robust in the presence of bugs in the training set, the algorithm follows two design
decisions: First, to reduce the risk that bugs are learned into the resulting specification, it learns not from the source code
itself but from execution traces, which contain fewer bugs.
Second, to take a ``vote'' on what the common program behavior is, the miner learns a
probabilistic regular language, whose high-probability core encodes the frequently followed protocol. The
approach promises to be robust: even a poorly tuned miner prototype was able to learn specifications that not only captured the correct
protocol, but also discovered serious bugs.
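The ``vote'' can be sketched as follows. This toy code (our assumed names and cutoff, not the published algorithm) estimates transition probabilities from the traces and keeps only the high-probability core, so a rare buggy trace does not pollute the learned specification:

```python
from collections import Counter, defaultdict

def mine_core(traces, cutoff=0.2):
    """Estimate per-state transition probabilities from traces and
    keep only transitions above the cutoff (the high-probability core)."""
    counts = defaultdict(Counter)
    for trace in traces:
        prev = "<start>"
        for call in trace:
            counts[prev][call] += 1
            prev = call
    core = {}
    for state, outs in counts.items():
        total = sum(outs.values())
        core[state] = {c for c, n in outs.items() if n / total >= cutoff}
    return core

# Nine correct traces and one buggy one (read before accept):
traces = [["bind", "listen", "accept", "read"]] * 9 + [["bind", "read"]]
core = mine_core(traces)
# The buggy transition bind -> read has probability 1/10, falls below
# the cutoff, and is pruned: core["bind"] == {"listen"}
```

A trace that disagrees with the specification's core is then a candidate bug, which is how the miner can both capture the correct protocol and expose errors.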