Spring '05

Debunking Dynamic Optimization Myths

Location: 310 Soda
Date: Mar 28
Time: 4-5pm
Seminar Speaker Name: Matt Arnold, IBM Watson

Programming languages that are executed by virtual machines face significant performance challenges beyond those confronted by traditional languages. First, portable program representations and dynamic language features force the deferral of most optimizations until runtime, inducing runtime optimization overhead. Second, modular program representations preclude many forms of whole-program interprocedural optimization. Third, virtual machines incur additional costs for runtime services such as security guarantees and automatic memory management. To address these challenges, mainstream virtual machine implementations include substantial infrastructure for online profiling, dynamic compilation, and feedback-directed optimization. This talk will survey the state of the art in dynamic compilation and adaptive optimization in virtual machines by debunking several misconceptions about these two topics.

This talk is based on a PLDI'04 tutorial created by Stephen Fink, David Grove, and Michael Hind.  The complete slides from this three-hour tutorial are available on Michael Hind's web page.


 

xtc: eXTensible C

Location: 310 Soda
Date: April 4
Time: 4-5pm
Seminar Speaker Name: Robert Grimm, NYU

Operating and distributed systems can significantly benefit from language and compiler support, even if they are written in C or C++. Examples include libasync for building event-driven distributed services, Capriccio for building scalable multi-threaded servers, and MACEDON for building overlay networks. Unfortunately, existing solutions for extending C-like languages are either not expressive enough or targeted at compiler experts, thus preventing other system builders from reaping similar benefits. To address this problem, we are exploring a new macro facility for C, called xtc for eXTensible C, which will allow system builders to easily realize their application-specific extensions. In this talk, I motivate our work, provide an overview of xtc's design, and describe our first step towards realizing this design, a toolkit for building extensible source-to-source transformers.

Fall’04


From Erasure to Wildcards: Adding Generics to the Java(TM) Programming Language

Location: 310 Soda
Date: Dec 6
Time: 4-5pm
Seminar Speaker Name: Joseph D. Darcy and Peter von der Ahé, Sun.

The problem of how to add some form of generics to the Java programming language received a great deal of attention from the programming language community and generated numerous proposals. The recently shipped JDK 5 supports generics both in the language and in retrofitted library classes. Based on the GJ proposal, the current form of generics also supports wildcards, a recent development in type theory related to existential types. This talk will summarize all the language changes in JDK 5, give an overview of the design, implementation, and use of simple generics, and discuss the wildcard mechanism.
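For readers unfamiliar with the JDK 5 features the talk covers, here is a small illustrative Java fragment (a hypothetical example, not taken from the talk) showing an erasure-based generic method and a bounded wildcard:

    import java.util.ArrayList;
    import java.util.List;

    public class WildcardDemo {
        // Generic method: T is checked at compile time and then erased,
        // so the compiled bytecode operates on plain Objects.
        static <T> T firstOrDefault(List<T> list, T fallback) {
            return list.isEmpty() ? fallback : list.get(0);
        }

        // Bounded wildcard: accepts List<Integer>, List<Double>, and so on.
        // Elements may be read out as Numbers, but the compiler rejects
        // attempts to add elements through this read-only view.
        static double sum(List<? extends Number> nums) {
            double total = 0.0;
            for (Number n : nums) {
                total += n.doubleValue();
            }
            return total;
        }

        public static void main(String[] args) {
            List<Integer> ints = new ArrayList<Integer>();
            ints.add(1);
            ints.add(2);
            System.out.println(firstOrDefault(ints, 0)); // prints 1
            System.out.println(sum(ints));               // prints 3.0
        }
    }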

 

Bios:


Joe received a master's degree from Berkeley in 1998 for "Borneo: Adding IEEE 754 floating-point support to Java." He has since worked in the JDK group on numerical issues and, in JDK 5, also led the effort to design and implement apt, the annotation processing tool.

 

Before receiving a Master's degree from the University of Aarhus, Denmark, in 2004 for "Applications of Concrete-Type Inference," Peter worked on various projects, including a proposal to extend the preexisting proposal for adding generics to JDK 5 with use-site variance. This work was incorporated into JDK 5 as the feature now known as "wildcards". Peter now maintains javac as part of the JDK group.


Building Introspective Software

Location: 310 Soda
Date: Nov 1
Time: 4-5pm
Seminar Speaker Name: Trishul Chilimbi, MS Research

Software systems and execution environments continue to grow in size, complexity, and diversity, making it harder to deliver purely static solutions for improving software quality. Our pragmatic approach is to continuously self-monitor application execution behavior and report encountered problems. We use the term introspective software for programs that continuously monitor and analyze their execution at a fine granularity. A key requirement for usable introspective software is a ubiquitous profiling infrastructure with sufficiently low overhead to have negligible impact on application performance. We describe a sampling-based framework that meets these requirements, yet is able to generate fine-grained temporal execution information that can be used to improve application performance and reliability. With this framework, we have achieved speedups of up to 2x on large, real-world software and discovered errors that would often be impossible to find with other techniques.

 

Bio:

Trishul Chilimbi is a researcher at Microsoft Research, where he leads the Runtime Analysis and Design research group. His research interests include programming languages, compilers, runtime systems, and computer architecture. He has worked extensively on improving memory system performance and has founded an ACM workshop on this topic. He currently works on applying runtime technologies to improve software performance, reliability, and security.  He received his Ph.D. from the University of Wisconsin-Madison. More information is available at his web page (http://research.microsoft.com/~trishulc).

 


Under- and over-approximation analyses using powersets and logical relations

Location: 310 Soda
Date: Oct 25
Time: 4-5pm
Seminar Speaker Name: David Schmidt, Kansas State University

Almost all static analyses are overapproximating (safety) analyses, which overestimate a program's behavior in exchange for a guarantee of termination.

Overapproximating analyses can be defined in terms of Galois connections, that is, a pair of functions, $\alpha, \gamma$, where $\alpha$ defines how to map an input test-data set into an abstract value for static analysis, and $\gamma$ shows how to map the result of the analysis back to the overestimated set of outputs.
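As a reminder of the standard definition (not part of the abstract): with concrete sets of behaviors ordered by $\subseteq$ and abstract values ordered by $\sqsubseteq$, the pair $(\alpha, \gamma)$ forms a Galois connection when

$$ \alpha(S) \sqsubseteq a \iff S \subseteq \gamma(a), $$

which implies $S \subseteq \gamma(\alpha(S))$: the analysis result, read back through $\gamma$, safely overestimates the concrete behavior.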

Underapproximating (liveness) analyses have gained in importance, and it is natural to try to dualize the Galois connection format, but the obvious dualization generates analyses that calculate no useful information.

This talk reviews overapproximation analysis, examines the underapproximation problem, and shows how the Galois connection framework can be applied when it is used with lower and upper powerdomain constructors. Dennis Dams's under- and overapproximation analyses of state-transition systems for model checking are used as a case study.

If time allows, the role of logical relations in lifting the Galois connections to higher type will be explained.



TVLA: Three-Valued-Logic Analyzer

Location: 310 Soda
Date: Sep 20
Time: 4-5pm
Seminar Speaker Name: Mooly Sagiv, Tel-Aviv University

Abstract: TVLA (Three-Valued-Logic Analyzer) is a ``YACC''-like framework for automatically constructing abstract interpreters from an operational semantics. The operational semantics is specified as a generic transition system based on first-order logic. TVLA was implemented in Java and successfully used to prove interesting properties of (concurrent) Java programs manipulating dynamically allocated linked data structures.

 


Formal Methods in Practice: A Tutorial Introduction to ACL2

Prof. J. Strother Moore, University of Texas, Austin

Tuesday, October 5 2004, 12:30-2pm, 310 Soda

 

In this talk I will describe some of the industrial applications of ACL2 and demonstrate the ACL2 theorem prover.  ACL2 stands for "A Computational Logic for Applicative Common Lisp"; ACL2 consists of a functional programming language ("Pure Common Lisp"), a first-order mathematical logic, and a mechanical theorem proving system in the Boyer-Moore tradition, written by Matt Kaufmann and Moore.

 

ACL2 has been used in industry, for example:

 

· to verify the RTL used to fab the elementary floating-point operations on the AMD Athlon, Opteron, and other microprocessors

· to verify the correspondence between bit- and cycle-accurate models of a Motorola DSP at the micro-architectural and microcode engine level

· to verify a microcoded separation kernel for the Rockwell Collins AAMP7 processor

· to verify properties of JVM bytecode

· to verify properties of the JVM bytecode verifier and class loader

 

In this talk, I will describe some of these applications more closely and then use elementary examples to illustrate some of ACL2's capabilities, including rewrite rules, congruence, linear arithmetic, and simple modeling of digital artifacts.

 

Toward the end of the class period the talk may dive more deeply into theorem proving issues of interest only to a few members of the audience.  Visitors are welcome to leave when that happens!


Spring’04


A Counter-Example to Lord J.M. Keynes's Aphorism

"What's not worth doing isn't worth doing well."

Prof. W. Kahan, UCB

Monday, May 17 2004, 4-5pm, 320 Soda

 

Abstract for Compiler Writers:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Almost no American Computer Science degrees require exposure to a competent treatment of floating-point arithmetic. Numerical naivety, combined with the impracticality of debugging complicated floating-point codes by means suitable for other kinds of programs, inflates the costs and risks incurred with numerical software everywhere but in video games. A case study illustrates the exacerbation of costs and risks by the lack, in all the most widely used programming languages, of support for the widest precision built into by far the majority of microprocessors' hardware. Adequately accurate and fast software for Lambert's W-functions W(y) and M(y), roots w of the simple equation w exp(w) = y, entangles the software developer in intense and error-prone error-analyses that would not be necessary if he had convenient access to the hardware's higher-precision arithmetic. Such access is part of the reforms of programming languages urgently required to enhance software reliability despite clever programmers' increasing ignorance.

 

 

Abstract for Numerical Analysts:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The equation w exp(w) = y is solved for real roots w as functions of y in floating-point arithmetic. The roots have two branches: w = W(y) > -1 for y > -1/e and w = M(y) < -1 for -1/e < y < 0. They are computed in just two steps of an iteration that converges at fourth order except at the branch-point y = -1/e, where convergence degrades to quadratic. This is novel; most iterations degrade to linear convergence at a double root. Also novel are the starting iterates, which improve upon asymptotic estimates known previously. Extensive error-analyses ensure unusual accuracy: well within 3 units in the last place of the arithmetic's precision up to 125 sig. bits. Two MATLAB programs work in IEEE Standard 754 Binary with 53 sig. bits. But these programs and error-analyses could have been a lot simpler if the most popular programming languages supported the most popular computers' floating-point hardware more nearly fully.
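For reference, the branch definitions above in display form (a restatement added for clarity, not part of the abstract):

$$ w\,e^{w} = y, \qquad w = W(y) > -1 \ \text{for } y > -1/e, \qquad w = M(y) < -1 \ \text{for } -1/e < y < 0, $$

with the two branches meeting at the branch point $y = -1/e$, where $w = -1$.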

 


Automatic Pool Allocation: Compile-Time Control over Complete Pointer-Based Data Structures

Vikram Adve, UIUC

Monday, May 10 2004, 4-5pm, 320 Soda

Traditional compiler techniques for pointer-intensive programs focus on disambiguating, reordering, or optimizing individual loads and stores, data elements, and data types.  We describe an alternative approach we call "Macroscopic Data Structure Transformations," which are program analyses and transformations that operate at the level of _entire_ recursive data structures. The foundations of this approach are

(1) Data Structure Analysis (DSA): a fast, context-sensitive analysis with some novel features, which extracts key properties of data  structure instances; and

(2) Automatic Pool Allocation: a new program transformation that gives the compiler partial control over data structure layout in the heap.

Automatic Pool Allocation itself improves performance significantly for heap-intensive programs but, more importantly, gives the compiler data layout information that enables other analyses and transformations that were not possible before.  We describe several novel examples of "macroscopic" optimization techniques enabled by Automatic Pool Allocation.  We have also used DSA and Pool Allocation to enforce memory safety without garbage collection in imperative programs, and we are building a secure, language-independent programming platform called SAFECode based on these techniques.  These diverse applications provide evidence that the macroscopic data structure approach opens up a broad new class of compiler techniques for pointer-intensive programs.

 

BIO:

    Vikram Adve is an Assistant Professor of Computer Science at the University of Illinois at Urbana-Champaign.  His research interests include compilers, computer architecture, and performance evaluation.

His research group has developed the LLVM Compiler Infrastructure, a novel, publicly distributed, compiler framework for "lifelong" optimization of programs in arbitrary languages.  The group is using LLVM for several broad research projects, including macroscopic data structure transformations, static analyses for enforcing program safety, and a new class of processor architectures we call Virtual Instruction Set Computers (VISC), which define a rich virtual instruction set implemented via a truly cooperative hardware/software microarchitecture.

    Adve received a B.Tech. from I.I.T. Bombay in 1987, a Ph.D. in Computer Science from the University of Wisconsin in 1993, and was a Research Scientist at Rice University before joining the University of Illinois.  He has received the NSF CAREER award, the Best Paper Award at the 2001 Workshop on Parallel and Distributed Simulation, and the UIUC Computer Science Department's Outstanding Junior Faculty Award.

He is an Associate Editor of the ACM Transactions on Programming Languages and Systems (TOPLAS).

 

 


Using Hardware Performance Monitors to Understand the Behavior of Java Applications

Peter F Sweeney, IBM Watson

Monday, May 5 2004, 4-5pm, 373 Soda

 

 

Modern Java programs, such as middleware and application servers, include many complex software components.   Improving the performance of these Java applications requires a better understanding of the interactions between the application, virtual machine, operating system, and architecture.  Hardware performance monitors, which are available on most modern processors, provide facilities to obtain detailed performance measurements of long-running applications in real time.  However, interpreting the data collected using hardware performance monitors is difficult because of the low-level nature of the data.

 

We have developed a system, consisting of two components, to alleviate the difficulty of interpreting results obtained using hardware performance monitors.  The first component is an enhanced VM that generates traces of hardware performance monitor values while executing Java programs.  This enhanced VM generates a separate trace for each Java thread and CPU combination and thus provides accurate results in a multithreaded and multiprocessor environment. The second component is a tool that allows users to interactively explore the traces using a graphical interface.  We implemented our system in the context of Jikes RVM, an open-source Java VM, and evaluated it on a POWER4 multiprocessor.  We demonstrate that our system is effective in uncovering as-yet-unknown performance characteristics and is a first step in exploring the reasons behind the observed behavior of a Java program.


A (Grand?) Unified Theory of Storage Reclamation

David F. Bacon

IBM T.J. Watson Research Center

Monday, March 1, 2004, 4pm, 320 Soda Hall

 

The two basic forms of automatic storage reclamation, tracing and reference counting, were invented at the dawn of the high-level language era over 40 years ago.  Since then there have been many improvements and optimizations, but all systems are based on one or the other of these methods, which are uniformly viewed as being fundamentally different and as possessing very distinct performance properties.  We have implemented high-performance collectors of both types, and in the process observed that the more we optimized them, the more similarly they behaved -- they seem to share some deep structure.

 

We present a formulation of the two algorithms that shows that they are in fact duals of each other.  Intuitively, the difference is that tracing operates on live objects, or "matter", while reference counting operates on dead objects, or "anti-matter".  For every operation by the tracing collector, there is a precisely corresponding anti-operation by the reference counting collector.
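To make the duality concrete, here is a deliberately simplified Java sketch (an illustration of the intuition only, not the authors' formulation): the tracing collector propagates work forward from the roots over live objects, while the reference-counting collector propagates work from a deleted reference over dead objects.

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.List;

    class Obj {
        final List<Obj> fields = new ArrayList<Obj>(); // outgoing references
        int rc = 0;              // used only by reference counting
        boolean marked = false;  // used only by tracing
    }

    class DualCollectors {
        // Tracing: start from the roots ("matter") and mark everything
        // reachable; whatever remains unmarked is garbage.
        static void trace(List<Obj> roots) {
            Deque<Obj> work = new ArrayDeque<Obj>(roots);
            while (!work.isEmpty()) {
                Obj o = work.pop();
                if (o.marked) continue;
                o.marked = true;
                work.addAll(o.fields);   // visit live objects, push their children
            }
        }

        // Reference counting: start from a deleted reference ("anti-matter")
        // and recursively reclaim everything whose count drops to zero.
        static void deleteReference(Obj target) {
            Deque<Obj> work = new ArrayDeque<Obj>();
            if (--target.rc == 0) work.push(target);
            while (!work.isEmpty()) {
                Obj dead = work.pop();
                for (Obj child : dead.fields) {   // visit dead objects, decrement children
                    if (--child.rc == 0) work.push(child);
                }
            }
        }
    }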

 

Once viewed in this light, we show that all high-performance collectors are in fact hybrids of tracing and reference counting (for example, deferred reference counting and generational collection).  We are developing a uniform cost-model for the collectors to quantify the tradeoffs that result from choosing different hybridizations of tracing and reference counting.  This will allow the correct scheme to be selected based on system performance requirements and the expected properties of the target application.

 

(joint work with Perry Cheng and V.T. Rajan)

 


Cooperative Bug Isolation

Benjamin Liblit

UC Berkeley

Monday, Feb. 9, 2004, 4pm

 

The resources available for improving software are always limited, and through sheer numbers a program's user community will uncover many flaws not captured during the program's development.  We propose a low-overhead sampling infrastructure for gathering information from the runs experienced by a program's user community.  Monitoring is both sparse and random, so complete information is never available about any single run.  However, the instrumentation strategy is also lightweight and therefore practical to deploy to user communities numbering in the thousands or millions.  Additionally we have developed a body of techniques, called "statistical debugging," for isolating the causes of bugs using this sparsely sampled data. Statistical debugging combines statistical modeling, machine learning, and program analysis to identify those program behaviors which are highly predictive of program failure.
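A toy Java sketch of the sampling idea (illustrative only, and far simpler than the deployed instrumentation strategy, which avoids per-site random-number calls on the fast path): each instrumentation site records its predicate only with small probability, so any single run contributes a sparse, random slice of information.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Random;

    class SparseSampler {
        private final Random rng = new Random();
        private final double rate;   // e.g. 0.001: sample roughly 1 in 1000 observations
        private final Map<String, long[]> counts = new HashMap<String, long[]>();

        SparseSampler(double rate) { this.rate = rate; }

        // Called at an instrumentation site. counts[site][0] counts sampled
        // observations where the predicate was true; counts[site][1] counts
        // all sampled observations at the site.
        void observe(String site, boolean predicate) {
            if (rng.nextDouble() >= rate) return;   // common case: record nothing
            long[] c = counts.get(site);
            if (c == null) { c = new long[2]; counts.put(site, c); }
            if (predicate) c[0]++;
            c[1]++;
        }

        // The sparse per-run report; reports from many users are aggregated
        // offline and correlated with failure to rank likely bug predictors.
        Map<String, long[]> report() { return counts; }
    }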

  We will also briefly consider other potential applications of sampling deployed software.

 


Fall’03

Atomizer: A Dynamic Atomicity Checker For Multithreaded Programs

Stephen Freund

Williams College

Monday, Dec. 15, 2003, 4pm

 

Ensuring the correctness of multithreaded programs is difficult, due to the potential for unexpected interactions between concurrent threads.

Much previous work has focused on detecting race conditions, but the absence of race conditions does not by itself prevent undesired thread interactions.

We focus on the more fundamental non-interference property of atomicity; a method is atomic if its execution is not affected by and does not interfere with concurrently-executing threads. Atomic methods can be understood according to their sequential semantics, which significantly simplifies reasoning about program correctness.
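A standard illustration of this distinction (a textbook-style example, not taken from the talk): the method below is free of data races, because every field access holds the lock, yet it is not atomic.

    class Account {
        private int balance;

        synchronized int getBalance() { return balance; }
        synchronized void setBalance(int b) { balance = b; }

        // Race-free but not atomic: another thread may update the balance
        // between the read and the write, silently losing a deposit.
        void deposit(int amount) {
            int current = getBalance();
            setBalance(current + amount);
        }
    }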

 

This talk presents a dynamic analysis for detecting atomicity violations. This analysis combines ideas from both Lipton's theory of reduction and earlier dynamic race detectors. Experience with a prototype checker for multithreaded Java code demonstrates that this approach is effective for detecting errors due to unintended interactions between threads. In particular, our atomicity checker detects errors that would be missed by standard race detectors, and it produces fewer false alarms on benign races that do not cause atomicity violations.

 

This is joint work with Cormac Flanagan (UC Santa Cruz).


A Type System Equivalent to a Model Checker

Jens Palsberg

UCLA

Tuesday, December 2, 2003

 

Type systems and model checking are two prevalent approaches to finding bugs in software.  Are they equally powerful?  Superficially, they seem quite different.  Type checking works on the program itself, typically in a modular fashion, while model checking usually works on a model of the whole program.  Type soundness ensures that all reachable program states satisfy a certain property provided the program is well typed, while model-checking soundness guarantees the same without any assumptions about types.  Thus, model checking seems to be more powerful than type checking: type soundness essentially says that type checking implies model checking, but there are not many cases where the converse holds.

 

In this talk, we present a type system that is equivalent to a model checker for a notable resource-bound analysis problem in real-time systems called deadline analysis.  The type system, which is value-sensitive, flow-sensitive, and context-sensitive, type checks exactly those programs that are accepted by the model checker.  Our result sheds light on the relationship between type systems and model checking.  It is a first step towards proving more general equivalences that could provide a clearer understanding of the benefits and tradeoffs of the two approaches.  Moreover, it shows promise in devising type systems for checking broad classes of resource-usage properties that are routinely checked using model checking.

 

Joint work with Mayur Naik.

 

About the speaker: Jens Palsberg is a Professor of Computer Science at UCLA.  His research interests span the areas of compilers, embedded systems, programming languages, software engineering, and information security.  He has authored over 70 technical papers, co-authored the book Object-Oriented Type Systems, and co-authored the 2002 revision of Appel's textbook Modern Compiler Implementation in Java.  He is the recipient of National Science Foundation CAREER and ITR awards, a Purdue University Faculty Scholar award, and an Okawa Foundation research award.  His research has also been supported by DARPA, IBM, Intel, and British Telecom.  He is an associate editor of ACM Transactions on Programming Languages and Systems, a member of the editorial board of Information and Computation, and a former member of the editorial board of IEEE Transactions on Software Engineering.  He is serving as the general chair of the ACM Symposium on Principles of Programming Languages in 2005.


Automatic Detection and Repair of Inconsistent Data Structures

Martin Rinard

MIT Lab for Computer Science

Oct. 27, 2003

 

Data structure corruption errors can cause a program to behave unacceptably or fail. This talk presents a system that accepts a specification of key data structure consistency properties, then (as the program executes) automatically detects and repairs violations of these properties. These repairs restore the consistency of the data structures and enable the software to continue to execute.
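As a miniature of the idea (a hypothetical Java example, not the system described in the talk): a consistency property relating a cached length to the underlying list, detected and repaired at run time so execution can continue.

    import java.util.ArrayList;
    import java.util.List;

    class CachedList {
        final List<Integer> elements = new ArrayList<Integer>();
        int cachedSize = 0;   // consistency property: cachedSize == elements.size()

        void add(int x) {
            elements.add(x);
            cachedSize++;     // a bug elsewhere might skip or corrupt this update
        }

        // Detect-and-repair pass: if the invariant is violated, restore it
        // so the program can keep operating instead of failing.
        void checkAndRepair() {
            if (cachedSize != elements.size()) {
                cachedSize = elements.size();
            }
        }
    }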

 

Our experience using our system indicates that the specifications are relatively easy to develop once one understands the data structures in question. Furthermore, for our set of benchmark applications, our system can effectively repair errors to deliver consistent data structures that allow the program to continue to operate successfully within its designed operating envelope.

 


 

Shallow Finite State Verification: Abstraction Techniques and Complexity Results

 

G. Ramalingam

IBM T.J. Watson Research Center

Friday Oct 24, 2-3pm

 

 

The desire for more reliable software has led to increasing interest in extended static checking: statically verifying whether a program satisfies certain desirable properties. A technique that has received particular attention is that of finite state or typestate verification. In this model, objects of a given type may exist in one of several states; the operations permitted on an object depend on the state of the object, and the operations may potentially alter the state of the object. The goal of typestate verification is to statically determine if the execution of a given program may cause an operation to be performed on an object in a state where the operation is not permitted. Typestate verification can thus be used to check that objects satisfy certain kinds of temporal properties; for example, that an object is not used before it is initialized, or that a file is not used after it is closed.
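For concreteness, a toy Java example of such a temporal property (illustrative, not from the talk): a handle whose read operation is legal only in the open state. Typestate verification aims to prove statically that the error branch can never execute, so the dynamic check becomes unnecessary.

    class Handle {
        private enum State { OPEN, CLOSED }
        private State state = State.OPEN;

        int read() {
            if (state != State.OPEN) {
                // A typestate verifier tries to prove this is unreachable.
                throw new IllegalStateException("read after close");
            }
            return 0; // stub for the real operation
        }

        void close() { state = State.CLOSED; }
    }

    class Clients {
        static void good(Handle h) { h.read(); h.close(); }  // verifies
        static void bad(Handle h)  { h.close(); h.read(); }  // statically rejected
    }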

 

 

In this talk, I consider the problem of _typestate verification_ for _shallow_ programs; i.e., programs where pointers from program variables to heap-allocated objects are allowed, but where heap-allocated objects may not themselves contain pointers.  I will present results relating the complexity of verification to the nature of the finite state machine used to specify the property.  Some properties are shown to be intractable, but others which appear to be quite similar admit polynomial-time verification algorithms.

 


 

From Local to Distributed (NRMI and J-Orchestra: Language Tools for Programming Distributed Applications)

 

Yannis Smaragdakis

Georgia Tech

Monday, Oct 13, 2003, 4pm

 

 

We present NRMI and J-Orchestra, two language tools for data-driven distributed programming.

 

NRMI is a middleware system supporting correct call-by-copy-restore semantics for arbitrary linked data structures. Call-by-copy-restore middleware is more natural than call-by-copy RPC mechanisms, often enabling remote calls to behave like local calls (e.g., in the important case of single-threaded clients and stateless servers).
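A deliberately simplified Java simulation of the difference (hypothetical code, not NRMI's implementation, and ignoring aliasing among linked structures, which is where the real difficulty lies):

    import java.util.ArrayList;
    import java.util.List;

    class CopyRestoreDemo {
        // Stand-in for a remote procedure that mutates its argument.
        static void appendMarker(List<String> log) {
            log.add("processed");
        }

        public static void main(String[] args) {
            // Call-by-copy (ordinary RPC semantics): the callee mutates a copy,
            // so the caller's structure is unchanged after the call.
            List<String> a = new ArrayList<String>();
            appendMarker(new ArrayList<String>(a));
            System.out.println(a);      // []

            // Call-by-copy-restore (the semantics NRMI provides): the copy is
            // mutated remotely and its final state is then restored into the
            // caller's structure, matching what a local call would have done.
            List<String> b = new ArrayList<String>();
            List<String> copy = new ArrayList<String>(b);
            appendMarker(copy);
            b.clear();
            b.addAll(copy);             // the "restore" step
            System.out.println(b);      // [processed]
        }
    }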

 

J-Orchestra is an automatic partitioning system for Java programs. J-Orchestra takes as input Java applications in bytecode format and transforms them into distributed applications, running on distinct Java Virtual Machines. To accomplish such automatic partitioning, J-Orchestra uses bytecode rewriting to substitute method calls with remote method calls, direct object references with proxy references, etc. J-Orchestra has significant generality, flexibility, and degree of automation advantages compared to previous work on automatic partitioning.

 

Bio:

            http://www.cc.gatech.edu/~yannis/

Yannis Smaragdakis is an Assistant Professor of Computer Science at Georgia Tech. His research interests include object-oriented language design and implementation (esp. multiparadigm programming, OO components, distributed programming in OO languages); advanced tools to facilitate program construction (esp. program generators, domain-specific languages, software components); and memory management (esp. virtual memory systems).

 


Counterexample-Guided Model Checking

Maria Sorea

SRI

Monday, Sep. 29, 2003

 

 

The generation of counterexamples is frequently touted as one of the primary advantages of model checking as a verification technique. However, the generation of trace-like counterexamples is limited to a small fragment of branching-time temporal logic. When model checking does succeed in verifying a property, there is typically no independently checkable witness that can be used as evidence for the verified property. We present a definition of witnesses, and, dually, counterexamples, for computation-tree logic (CTL), and describe a model checking algorithm that is based on the generation of evidence. Our model checking algorithm is local in the sense that it explores only the reachable states, and in some cases, even less than the reachable states. It partitions the given initial set of states into those that do, and those that do not satisfy the given property, with a corresponding witness and counterexample that is independently verifiable. We have built a model checker based on these ideas that works quite efficiently despite the overhead of generating evidence.

 


Logical Algorithms

 

Harald Ganzinger

MPI Informatik (joint work with David McAllester, TTI, Chicago)

Thursday, Sep. 18, 2003

 

Nontrivial meta-complexity theorems, proved once for a programming language as a whole, facilitate the presentation and analysis of particular programs. The talk discusses several such theorems for bottom-up logic programs. The theorems apply to many problems on graphs, and in particular to problems related to program analysis, when formulated as logical algorithms. One may also view this work as an attempt to identify relationships between familiar concepts in logic (logical variables, term constructors, equality, saturation, prioritized strategies) and algorithmic paradigms such as hashing, dynamic programming, union-find, and priority queues.
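As a small, standard example of a bottom-up logic program of the kind meant here (added for illustration, not drawn from the talk), graph reachability can be written as two rules that are run to saturation:

$$ \frac{\mathit{edge}(x, y)}{\mathit{reach}(x, y)} \qquad\qquad \frac{\mathit{reach}(x, y) \quad \mathit{edge}(y, z)}{\mathit{reach}(x, z)} $$

A meta-complexity theorem in the McAllester style bounds the running time of such a program by the size of the input plus the number of rule-prefix instantiations; for these rules that gives $O(n \cdot e)$ for a graph with $n$ nodes and $e$ edges.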

 


 


Spring’03

We are in the process of copying the talk information here.


Fall’02

Fighting spam: Moderately hard memory-bound computations
Mike Burrows
Microsoft Research
Monday, Sept. 30 4-5 PM

A resource may be abused if its users incur little or no cost. For example, e-mail abuse is rampant because sending an e-mail has negligible cost for the sender. It has been suggested that such abuse may be discouraged by introducing an artificial cost in the form of a moderately expensive computation. Thus, the sender of an e-mail might be required to pay by computing for a few seconds before the e-mail is accepted. Unfortunately, because of sharp disparities across computer systems, this approach may be ineffective against malicious users with high-end systems, prohibitively slow for legitimate users with low-end systems, or both. Starting from this observation, we discuss moderately hard functions that most recent systems will evaluate at about the same speed. For this purpose, we consider memory-bound computations. This talk describes a family of moderately hard, memory-bound functions, and explains how to use them for protecting against abuses. The talk also includes a brief description of a related network service for spam deterrence. (Joint work with Martin Abadi, Mark Manasse, and Ted Wobber.)


Type Qualifiers: Lightweight Specifications to Improve Software Quality
Jeffrey S. Foster
UC Berkeley
Monday, Oct. 7 4-5 pm

Software continues to increase in size and complexity, yet programmers have few practical tools and techniques to ensure its quality. In this talk I will discuss type qualifiers, a lightweight, practical mechanism for specifying and checking properties not captured by traditional type systems. I will describe efficient type qualifier inference algorithms, including a lazy, constraint-based flow-sensitive inference algorithm that explicitly models pointers, heap-allocated data, and aliasing.

As part of my research I have built cqual, a tool for adding new type qualifiers to C. During the talk I will demonstrate two applications of cqual: finding format-string bugs, a particular kind of security vulnerability, in C programs, and detecting simple deadlocks in the Linux kernel. I will also discuss experiments in which we found a number of format-string bugs and potential deadlocks.

More information about cqual, including source code and a web-based demonstration, can be found at http://www.cs.berkeley.edu/~jfoster/cqual.


SLAM: Software Model Checking from Theory to Practice
Sriram Rajamani
Microsoft Research
Monday, Oct. 14 4-5 pm

The goal of the SLAM project is to analyze system software written in C, and check it against temporal safety properties. The SLAM toolkit uses model checking, program analysis and theorem proving techniques to do automatic analysis, and requires no user-supplied abstractions or annotations such as pre-conditions, post-conditions or loop invariants. We have successfully applied the SLAM toolkit to over 100 Windows device drivers, to both validate their behavior and find defects in their usage of kernel-level APIs. The SLAM toolkit currently is in transition from a research prototype to a product-grade tool that will be supported and used by the Windows development organization.

I will start by describing the ideas and algorithms behind SLAM. As is often the case, there is a gap between the theory and algorithms one publishes and the implementation of a tool that has been engineered to perform acceptably in practice. In the second half of the talk, I will detail some of the engineering "secrets" we applied to the basic SLAM process to improve its performance on Windows device drivers.

(this is joint work with Tom Ball of Microsoft Research, and several visitors and interns: Andreas Podelski, Rupak Majumdar, Todd Millstein, Sagar Chaki, Wes Weimer, Satyaki Das, Robby, Mayur Naik and Jakob Lichtenberg)

Bio: Sriram K. Rajamani is a researcher with Microsoft Research. His research interests are in tools and methodologies for building reliable software.


Storeless Semantics and Separation Logic
David Schmidt
Kansas State University

Monday, Oct. 28 4-5 pm

Jonkers and Deutsch have promoted ``storeless semantics,'' where locations and even the store/heap disappear and are replaced by paths of identifiers and field selectors. Axiomatic (Hoare-) logic is a classic example of a storeless semantics, and in this talk, we present Reynolds-O'Hearn separation logic, an axiomatic logic for reasoning about heap storage, and attempt to employ it in a storeless semantics for an applicative, object-based language.
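Two ingredients of separation logic, recalled here for orientation (standard notation, not taken from the abstract): the points-to assertion $x \mapsto v$, which says the heap consists of exactly one cell at $x$ holding $v$, and the frame rule, which enables modular reasoning over disjoint heap regions joined by the separating conjunction $*$:

$$ \frac{\{P\}\; C\; \{Q\}}{\{P * R\}\; C\; \{Q * R\}} \qquad (\text{no variable free in } R \text{ is modified by } C). $$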

Our goal is to systematically derive modular region- and escape-analyses as instances of separation-logic semantics.


Industrial-Strength Dynamic Race Detection
Robert O'Callahan
IBM Research
Monday, Nov. 25 4-5 pm

Data races are an important class of bugs because they are particularly difficult to detect using normal testing methods, and they are often difficult to understand. At IBM Watson we have been developing dynamic techniques for detecting potential data races in extremely large Java systems such as IBM's Websphere application server. In this talk I will describe work we did in 2001 to improve the efficiency and accuracy of lockset-based data race detection. (This work was reported in PLDI 2002.) Then I will explain the limitations of that work when applied to systems like Websphere, particularly in the areas of scalability, accuracy and usability, and how we are addressing those challenges. In particular I will describe how we adapt a different kind of race detection based on causality tracking (sometimes known as happens-before detection) from the field of distributed systems and combine it with lockset-based detection to reduce the number of false positives. I will present the results of applying our tool to three Web application servers: Apache Tomcat, Resin, and Websphere. Some of the results have implications for language and compiler design. I will argue that future progress depends on developing new techniques for diagnosing and explaining complex bugs in addition to just finding them.

This is joint work with Jong-Deok Choi, Manu Sridharan, Keunwoo Lee, Vivek Sarkar and Alexey Loginov.


How to Get Annotations and Specifications into Industrial Code: Three easy lessons
Daniel Weise
Microsoft Research
Monday, Dec. 2 4-5 pm

In response to the need for more reliable products, Microsoft development groups are rapidly moving towards code annotations and domain-specific checkers for those annotations. Yet common wisdom from the academic community had it that you could never get developers to annotate their code. How did this happen? This talk will cover the current state of code annotation and checking technology at Microsoft, how we got developers to annotate code, and where we are headed.

Bio: Daniel Weise has advanced degrees from a respectable place (M.I.T.), been faculty at a nearly respectable place (Stanford University), and run a successful research group at a controversial place (Microsoft Research, where his research group designed and built technology that is having tremendous impact on Microsoft products). In spite of this, he now insists on slumming in product groups, because that's where he believes the important action is: it is the best way to effect change, to put good ideas from research into practice, and to have research ideas informed by practice.


Proving Theorems about Java and the JVM
J. Strother Moore
University of Texas, Austin
Thursday, Dec. 5 2-3 pm

I will explain how the latest version of the Boyer-Moore theorem prover, ACL2, is used to prove theorems about Java methods. ACL2 may be thought of as a theorem prover for functional Common Lisp. We model the Java Virtual Machine (JVM) operationally, e.g., as a Lisp function. We prove theorems about Java methods by compiling the methods with javac and proving theorems about the execution of the resulting bytecode by the JVM model. I will discuss the general approach and show some examples. The examples will deal with Java's int arithmetic, simple control, including recursion, object creation and modification in the heap, thread creation, and mutual exclusion via monitors.

Bio: J Strother Moore holds the Admiral B.R. Inman Centennial Chair in Computing Theory at the University of Texas at Austin. He is also chair of the department. He is the author of many books and papers on automated theorem proving and mechanical verification of computing systems. Along with Boyer he is a co-author of the Boyer-Moore theorem prover and the Boyer-Moore fast string searching algorithm. With Matt Kaufmann he is the co-author of the ACL2 theorem prover. Moore got his PhD from the University of Edinburgh in 1973 and his BS from MIT in 1970. Moore was a founder of Computational Logic, Inc., and served as its chief scientist for ten years. He and Bob Boyer were awarded the 1991 Current Prize in Automatic Theorem Proving by the American Mathematical Society. In 1999 they were awarded the Herbrand Award for their work in automatic theorem proving. Moore is a Fellow of the American Association for Artificial Intelligence.


Algorithms for and the Complexity of Subtype Entailment
Zhendong Su
UC Berkeley
Monday, Dec. 9 4-5 PM

The decidability of subtype entailment is an important open problem in the design and implementation of scalable and expressive subtype systems and type-based program analyses. The problem, so far, has resisted many competent attacks. In this talk, I will report recent progress towards resolving this problem. In particular, I will show that the first-order theory of subtyping constraints is undecidable, and in the case where all type constructors are unary and constants, it is decidable via an automata-theoretic reduction. In addition, I will present efficient algorithms for entailment with conditional equality constraints, arguably the simplest type language with subtyping.


Open Runtime Platform: Flexibility with Performance using Interfaces
Michal Cierniak
Intel
Monday, Feb. 3 4-5 pm

According to conventional wisdom, interfaces provide flexibility at the cost of performance. Most high-performance Java virtual machines today tightly integrate their core virtual machines with their just-in-time compilers and garbage collectors to get the best performance. The Open Runtime Platform (ORP) is unusual in that it reconciles high performance with the extensive use of well-defined interfaces between its components. ORP was developed to support experiments in dynamic compilation, garbage collection, synchronization, and other technologies. To achieve this, two key interfaces were designed: one for garbage collection and another for just-in-time compilation. This talk describes some interesting features of these interfaces and discusses lessons learned in their use. One lesson we learned was to selectively expose small but frequently accessed data structures in our interfaces; this improves performance while minimizing the number of interface crossings.

(This work will be presented at Java Grande)


Spring 2002

Towards the Development of Robust Device Drivers

Prof. Laurent Réveillère

Enseirb
Monday, Feb. 4   4-5 pm

 

 

Device drivers represent an increasing proportion of  operating system code -- more than 70% for Linux.   Nevertheless, although device drivers are a critical part of an operating system, the process of their development remains rudimentary and requires a high level of expertise.  This situation is demonstrated by a recent study that shows that the propensity of device drivers to contain errors is up to seven times higher than the rest of the kernel.

 

In this talk, I will present a new approach to ease the  development of device drivers. This approach is based on a domain-specific language, named Devil, targeted towards  specifying the programming interface of a device.  The processing of a Devil specification begins by its analysis to detect inconsistencies.  The code necessary to implement the communication between the device and the driver is then automatically generated.  This code can be generated in one of two forms, depending on whether error checking or run-time performance is favored. I will assess the robustness improvement of Devil based drivers by applying a mutation analysis.

 

Laurent Réveillère received his PhD from University of Rennes I in 2001. Currently, he holds a position of Assistant Professor at Enseirb, an engineering school located in Bordeaux (France). Laurent Réveillère's main research interest lies in the area of domain-specific languages design and implementation.


Optimizing Component Interaction

Mark Wegman

IBM TJ Watson Research Center

Thursday, Feb. 7 3:30-5:00
380 Soda Hall

 

Programmers are increasingly programming by component assembly, despite the fact that the use of components tends to decrease the performance of the program. The programmers who design the components do not know anything about the environment the components will be used in, and can therefore not tailor the components for each specific use. For modularity reasons the users of components should not know what is in them. The efficiency of a program suffers from the use of such generic components, since it fundamentally depends on the interactions between the components (including caller-callee interactions). The performance of a component-based program is therefore often less than optimal.

Optimizing component interaction is a fundamental problem in constructing efficient component-oriented programs. Our work lays out a foundation for reformulating the implementation of the interaction between components, with the goal of optimizing the assembled system. Our approach amounts to automatically choosing among a number of candidate implementation alternatives for each of the data structures and communication mechanisms that the components use during their interactions. This choice is made by an interesting combination of dynamic and static analysis. The system becomes aware of alternatives one component can offer to another and of the kinds of uses another requires, so we can get the two interacting better via the system than seems permissible with good software engineering practice.

Some of our initial work in this area has shown how to transform one example of component interaction optimization into a graph problem, and has proposed analysis of dynamics and fast graph reduction heuristics as an efficient technique for solving the problem. Our hope is that this work will inspire debate that ultimately will lead to a new area of investigation, within the field of compiler-performed optimization, with the potential to achieve orders-of-magnitude improvements in the performance of component-oriented programs.

 

 

 


 

Practical Linear Types for Imperative Programming

Manuel Faehndrich

Microsoft Research

Monday, February 11   4-5 pm

 

VAULT is a programming language combining aspects of C and ML. There are two reasons for creating a new language. First, to combine programmer control over data layout and lifetime as in C or C++ with the memory safety of ML or Java.  Second, to capture usage rules in interface descriptions in order to check both clients and the implementation itself.  Usage rules enable stating for example the valid order of function calls and the ownership transfers of data. Both memory safety and usage rules are expressed as part of VAULT's type system, using a form of linear types.

 

Linearity provides powerful reasoning about state changes, but at the price of restrictions on aliasing. The hard division between linear and nonlinear types forces the programmer to make a trade-off between checking a protocol on an object and unrestricted aliasing of the object. Most onerous is the restriction that any type with a linear component must itself be linear. We propose two constructs, adoption and focus, with new typing rules that reduce these restrictions. Adoption safely allows aliasing of objects on which a protocol is being enforced, whereas focus allows the reverse.


Real-Time Dynamic Atomic Broadcast

Nancy Lynch

MIT

Tuesday, February 12  3:30-4:30 pm

320 Soda Hall

 

This talk introduces a problem of atomic broadcast with real-time latency guarantees in a dynamic setting where processes may join, leave voluntarily, or fail (by stopping) during the course of computation.  We provide a formal definition of the Real-Time Dynamic Atomic Broadcast problem and present and analyze a new algorithm for its solution.

 

The algorithm exhibits constant message delivery latency in the absence of failures, even during periods when participants join or leave.  When failures occur, the latency bound is linear in the number of actual failures.  These bounds improve upon previously suggested algorithms solving similar problems in the context of view-oriented group communication.

 

Our algorithm uses a solution to a variation on the standard distributed consensus problem, in which participants do not know a priori who the other participants are.  We define the new problem, which we call Consensus with Unknown Participants, and give an early-stopping algorithm to solve it.

 

Based on joint work with Ziv Bar-Joseph and Idit Keidar

 

 


Trust Management and Distributed Access Control

John Mitchell

Stanford University

Monday, February 25   4-5 pm

 

Trust Management is an approach to distributed access control that combines policies, digital signatures, and logical deduction. This talk will describe some aspects of a project that includes a policy language framework, deduction engine, and secure mobile code mechanisms for distributed access control. Two example applications are distributed scheduling (calendar) and web-based file sharing systems.

 

This talk includes joint work with Ajay Chander, Drew Dean, Ninghui Li, Will Winsborough, and others.

 

 

 


ToonTalk - Programming From Inside an Animated Game-like World

Ken Kahn

Animated Programs

Monday, March 4   4-5 pm

 

 

We have constructed a general-purpose concurrent programming system, ToonTalk, in which the source code is animated and the programming environment is a video game.  Every abstract computational aspect is mapped into a concrete metaphor.  For example, a computation is a city, a concurrent object is a house, birds carry messages between houses, a method or clause is a robot trained by the user and so on.  The programmer controls a "programmer persona" in this animated world to construct, run, debug and modify programs.  ToonTalk also includes a tutorial game that teaches programming. We believe that ToonTalk is especially well suited for giving children the opportunity to build real programs in a manner that is easy to learn and fun to do. See http://www.toontalk.com for more information. A live demo of ToonTalk will be given.

 

ToonTalk was designed and built by Ken Kahn <KenKahn@ToonTalk.com> who, after earning a doctorate in computer science from MIT, spent 20 years as a researcher in programming languages, computer animation, and programming systems for children.  He has been a faculty member at MIT, University of Stockholm, and Uppsala University.  For over eight years he was a researcher at Xerox PARC. In 1992, Ken founded Animated Programs whose mission is to make computer programming child's play.

 


Jim Larus

Microsoft

Monday, March 11


Craig Chambers

University of Washington

Monday, May 13,  4-5 pm