Michael Ernst's research interests

I am recruiting motivated students who are interested in software engineering, programming languages, security, testing, and related topics.

Also see:

my publications by date
list of potential research projects and areas for students and collaborators, from undergraduate through PhD and beyond (available only from the washington.edu domain; if you are not at UW, then see my current projects for a flavor)
list of students collaborating with me
You can be one of them!
distributed software

My primary research area is programmer productivity, which spans the spectrum from software engineering, to program analysis (both static and dynamic), to programming language design. My broader computer science research interests range over a wide variety of fascinating topics, with no end to the fun in sight. Here are the ones about which I have published. (Each paper is listed only once, and superseded papers are not listed.)

Contents:

Programming language design
Immutability (side effects)
Ownership (encapsulation)
Static analysis
Synthesis
Concurrency
Distributed systems
Verification
Security
Refactoring
Defect prediction
Test generation
Testing
Debugging
Program repair
Specification inference
Dynamic analysis
Speculative analysis
Software engineering
User interfaces
Natural language processing
Theory
Education
Miscellaneous
Mobile applications

Programming language design

IR '95: Intermediate Representations Workshop Proceedings (1995): This workshop examined current trends and research in the design and use of intermediate representations for manipulating computer programs.
Predicate dispatching: A unified theory of dispatch (ECOOP '98: the 12th European Conference on Object-Oriented Programming, 1998): Predicate dispatching generalizes and subsumes previous method dispatch mechanisms (object-oriented dispatch, pattern matching, and more) by permitting arbitrary predicates to control method applicability and by using logical implication between predicates as the overriding relationship.
An empirical analysis of C preprocessor use (IEEE Transactions on Software Engineering, 2002): Undisciplined use of the C preprocessor can greatly hinder program understanding and manipulation. This paper examines 1.2 million lines of C code to determine what types of problematic uses appear in practice.
Using predicate fields in a highly flexible industrial control system (OOPSLA 2005, Object-Oriented Programming Systems, Languages, and Applications, 2005): Predicate-oriented programming has not previously been evaluated outside a research context. This paper describes a case study that evaluates the use of predicate fields in a large-scale application that is in daily use.
Practical pluggable types for Java (ISSTA 2008, Proceedings of the 2008 International Symposium on Software Testing and Analysis, 2008): We have created a framework for pluggable type-checking in Java (including backward-compatible syntax). Checkers built with the framework found real errors in existing software.
Always-available static and dynamic feedback (ICSE 2011, Proceedings of the 33rd International Conference on Software Engineering, 2011): The DuctileJ language enables the rapid development and flexibility of a dynamically-typed language, and the reliability and comprehensibility of a statically-typed language, via two views of the program during development.
Building and using pluggable type-checkers (ICSE 2011, Proceedings of the 33rd International Conference on Software Engineering, 2011): This paper evaluates the ease of pluggable type-checking with the Checker Framework. The type checkers were easy to write, easy for novices to use, and effective in finding hundreds of errors in case studies of 2 million lines of code.
Annotations on Java types (2014): This specification extends Java's annotation system to permit annotations on any use of a type (including generic type arguments, method receivers, etc.). This is the Proposed Final Draft, which is included in Java 8.

Immutability (side effects)

A practical type system and language for reference immutability (OOPSLA 2004, Object-Oriented Programming Systems, Languages, and Applications, 2004): This paper presents a type system, language, implementation, and evaluation of a safe mechanism for enforcing reference immutability, where an immutable reference may not be used to cause side effects to any object reachable from it.
Javari: Adding reference immutability to Java (OOPSLA 2005, Object-Oriented Programming Systems, Languages, and Applications, 2005): A compiler can guarantee that an immutable reference is not used to cause side effects to any reachable object. This paper extends previous proposals in many ways, including formal type rules and support for Java generics.
Object and reference immutability using Java generics (ESEC/FSE 2007: Proceedings of the 11th European Software Engineering Conference and the 15th ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2007): This paper shows how to provide compiler-checked guarantees of both object immutability and reference immutability, without changes to Java syntax, by providing a single “immutability” generic parameter for each class.
Inference of reference immutability (ECOOP 2008 — Object-Oriented Programming, 22nd European Conference, 2008): This paper presents a precise, scalable, novel algorithm for inference of reference immutability (as defined by the Javari language), and an experimental evaluation on substantial programs.
Parameter reference immutability: Formal definition, inference tool, and comparison (Automated Software Engineering, 2009): This paper defines reference immutability formally and precisely. The paper gives new algorithms for inferring immutability that are more scalable and precise than previous approaches. The paper compares our algorithms, previous algorithms, and the formal definition.
Featherweight Ownership and Immutability Generic Java (FOIGJ) (TR, 2009): This paper prestents a formalism and proofs of a type system (OIGJ) that combines ownership with immutablity. This report has since been superseded.
Ownership and immutability in generic Java (OOPSLA 2010, Object-Oriented Programming Systems, Languages, and Applications, 2010): This paper shows how to combine ownership and immutability, producing a result that is more expressive and safe than previous attempts. The paper includes both a proof of soundness and an implementation with case studies.
ReIm & ReImInfer: Checking and inference of reference immutability and method purity (OOPSLA 2012, Object-Oriented Programming Systems, Languages, and Applications, 2012): We present a new variant of reference immutability, along with efficient type inference and type checking algorithms and implementations.
Immutability (Aliasing in Object-Oriented Programming, 2013): This review chapter discusses immutability in object-oriented languages, with particular focus on how it impacts aliasing.
Rely-guarantee references for refinement types over aliased mutable data (PLDI 2013: Proceedings of the ACM SIGPLAN 2013 Conference on Programming Language Design and Implementation, 2013): This paper presents a new type system for reasoning about side effects in the presence of aliasing. Its technical approach unifies reference immutability type systems and rely-guarantee program logics.

Ownership (encapsulation)

Featherweight Ownership and Immutability Generic Java (FOIGJ) (TR, 2009): This paper prestents a formalism and proofs of a type system (OIGJ) that combines ownership with immutablity. This report has since been superseded.
Ownership and immutability in generic Java (OOPSLA 2010, Object-Oriented Programming Systems, Languages, and Applications, 2010): This paper shows how to combine ownership and immutability, producing a result that is more expressive and safe than previous attempts. The paper includes both a proof of soundness and an implementation with case studies.
Tunable static inference for Generic Universe Types (ECOOP 2011 — Object-Oriented Programming, 25th European Conference, 2011): Type inference for ownershup types is underconstrained: many different solutions are legal. This paper transforms the inference problem into an instance of Weighted Max-SAT to find a good inference solution.
Inference and checking of object ownership (ECOOP 2012 — Object-Oriented Programming, 26th European Conference, 2012): This paper generalizes previous work on ownership inference, showing how a single algorithm can apply to multiple ownership type systems and choose a desirable typing among many possible ones.

Static analysis

Value dependence graphs: Representation without taxation (POPL '94: Proceedings of the 21st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1994): This paper describes the value dependence graph (VDG), a sparse, functional, dataflow-like program representation based on demand dependences. It also presents some of the analyses and transformations that the VDG simplifies.
Practical fine-grained static slicing of optimized code (TR, 1994): Slicing helps visualize dependences and restrict attention relevant program components. This paper describes techniques, and an implementation, for slicing on arbitrary program values, omitting irrelevant parts of statements, and exploiting optimizations.
Slicing pointers and procedures (abstract) (TR, 1995): This paper describes how to efficiently extend program slicing to arbitrary pointer manipulations, aggregate values, and procedure calls. The implementation is the first for an entire practical programming language.
Selecting, refining, and evaluating predicates for program analysis (TR, 2003): This paper investigates several techniques, notably dynamic ones based on random selection and machine learning, for predicate abstraction — selecting the predicates that are most effective for a program analysis.
Inference of field initialization (ICSE 2011, Proceedings of the 33rd International Conference on Software Engineering, 2011): This paper presents a flow-sensitive, exception-aware static analysis that infers what program variables might hold not-yet-fully-initialized objects. Such an object might violate its object invariants, until its initialization is complete.
HAMPI: A solver for word equations over strings, regular expressions, and context-free grammars (ACM Transactions on Software Engineering and Methodology, 2012): This paper describes an efficient and expressive solver (that is, a decision procedure) for string constraints such as language membership. Such a solver is useful when analyzing programs that manipulate strings.
Cascade: A universal programmer-assisted type qualifier inference tool (ICSE 2015, Proceedings of the 37th International Conference on Software Engineering, 2015): Rather than operating a type inference tool in batch mode, this paper proposes that the tool should be run incrementally, with the programmer approving each step and being permitted to edit the program along the way.
Static analysis of implicit control flow: Resolving Java reflection and Android intents (ASE 2015: Proceedings of the 30th Annual International Conference on Automated Software Engineering, 2015): Standard program analysis understands control flow through direct procedure calls. This paper provides analyses that resolve uses of Java reflecton and the data passed in Android intents.
Boolean formulas for the static identification of injection attacks in Java (LPAR 2015: Proceedings of the 20th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, 2015): This paper describes a flow- and context-sensitive static analysis that automatically identifies where injections of tainted data can occur in a program.
Notes on Program Analysis (2022): This book is an introduction to program analysis. It covers abstract interpretation (dataflow analysis), testing, and other topics. It is appropriate for self-study or for use in a graduate or advanced undergraduate class.
Nothing is better than the Optional type. Really. Nothing is better. (Java Magazine, 2022): The blogosphere is full of claims that the Optional class solves the problem of null pointer exceptions. This article explains why those claims are not true, and a better way to prevent null pointer exceptions.
Pluggable type inference for free (ASE 2023: Proceedings of the 38th Annual International Conference on Automated Software Engineering, 2023): This paper describes a type inference framework. Given a pluggable type-checker implementation, the framework automatically creates a type inference tool.
Inference of resource management specifications (Proceedings of the ACM on Programming Languages, 2023): This paper proposes an optimistic analysis to infer specifications about how a program uses and releases resources. Inference reduces the programmer burden to use a specify-and-verify approach to preventing resource leaks.
Call graph soundness in Android static analysis (ISSTA 2024, Proceedings of the 2024 International Symposium on Software Testing and Analysis, 2024): We analyzed 13 static analyzers for Android code. They are very unsound: they failed to analyze 61%, on average, of the dynamically-executed methods. Tools were bad at discovering entry points, higher-precision tools were most unsound.

Synthesis

Fast synthesis of fast collections (PLDI 2016: Proceedings of the ACM SIGPLAN 2016 Conference on Programming Language Design and Implementation, 2016): The Cozy tool takes as input a specification for a collection data structure and query operations, and produces as output an efficient implementation. Cozy decomposes the problem into two parts by first synthesizing an “outline” in terms of high-level operations.
Program synthesis from natural language using recurrent neural networks (TR, 2017): Tellina is a system that translates an English description of a file system operation into a bash command. A controlled user experiment shows that it helps programmers perform file system tasks more quickly.
Generalized data structure synthesis (ICSE 2018, Proceedings of the 40th International Conference on Software Engineering, 2018): This paper shows how to synthesize a data structure that tracks subsets and aggregations of multiple related collections. The input is a trivial implementation of the desired API operations, and the output is an efficient implementation.
NL2Bash: A corpus and semantic parser for natural language interface to the Linux operating system (LREC: Language Resources and Evaluation Conference, 2018): NL2Bash is a dataset containing pairs of (English text, Bash commands). This dataset can be used in training and evaluating tools that translate from English to bash.

Concurrency

Serializing parallel programs by removing redundant computation (TR, 1994): This paper introduces and evaluates methods for automatically transforming a parallel algorithm into a less parallel one that takes advantage of only the parallelism available at run time, thus performing less computation to produce the same results.
Static deadlock detection for Java libraries (ECOOP 2005 — Object-Oriented Programming, 19th European Conference, 2005): This paper gives a technique for determining whether any invocation of a library can lead to deadlock in the client program; it can also be extended to the closed-world (whole-program) case.
Refactoring sequential Java code for concurrency via concurrent libraries (ICSE 2009, Proceedings of the 31st International Conference on Software Engineering, 2009): This paper presents a refactoring tool, Concurrencer, that transforms sequential code into parallel code that uses the java.util.concurrent libraries.
How do programs become more concurrent? A story of program transformations (Proceedings of the 4th International Workshop on Multicore Software Engineering, 2011): This paper is a historical analysis of the transformations that programmers used to convert sequential programs into concurrent versions.
Static lock capabilities for deadlock freedom (TLDI 2012: The seventh ACM SIGPLAN Workshop on Types in Language Design and Implementation, 2012): This paper prevents deadlock by forcing the locking discipline to follow pointers in the heap (or any other tree structure). It permits fine-grained locking, arbitrary lock acquisition orders, changing the locking order, etc.
Finding errors in multithreaded GUI applications (ISSTA 2012, Proceedings of the 2012 International Symposium on Software Testing and Analysis, 2012): In a GUI program, only the main thread (event thread) should access GUI objects. Our analysis verifies this property, and found crashing errors in Android, Eclipse plugin, Swing, and SWT applications.
JavaUI: Effects for controlling UI object access (ECOOP 2013 — Object-Oriented Programming, 27th European Conference, 2013): In a GUI program, only the main thread (event thread) should access GUI objects. Our sound static analysis verifies this property through an effect system, and scaled to 140KLOC.
Inferring models of concurrent systems from logs of their behavior with CSight (ICSE 2014, Proceedings of the 36th International Conference on Software Engineering, 2014): This paper provides a new approach to inferring a communicating finite state machine (CFSM) from executions of a concurrent system. The paper also proves its techniques correct and evaluates an implementation on system logs and a user study.
Locking discipline inference and checking (ICSE 2016, Proceedings of the 38th International Conference on Software Engineering, 2016): This paper describes an abstract inference and a type system and for inferring and checking a locking discipline that provides value protection. It also compares programmer-written annotations to implementations of each.
Semantics for locking specifications (NFM 2016: 8th NASA Formal Methods Symposium, 2016): Java's @GuardedBy annotation is intended to prevent race conditions, but current tools give it a definition that permits race conditions. This paper formalizes a correct definition that prevents race conditions as intended.
Verifying Invariants of Lock-free Data Structures with Rely-Guarantee and Refinement Types (ACM Transactions on Programming Languages and Systems, 2017): This paper presents a new approach to verifying lock-free concurrent data structures that handles more complex code: view references as capabilities and verify them using rely-guarantee reasoning.

Distributed systems

Using simulated execution in verifying distributed algorithms (Software Tools for Technology Transfer, 2004): This paper proposes integration of dynamic analysis with traditional theorem-proving in ways that extend beyond mere testing. Generalizing over the test runs can reveal necessary lemmas, and the structure of the proof can mirror the structure of the execution.
Leveraging existing instrumentation to automatically infer invariant-constrained models (ESEC/FSE 2011: The 8th joint meeting of the European Software Engineering Conference (ESEC) and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), 2011): Synoptic infers, from system logs, a finite state machine model of the system. Synoptic's novel inference algorithm efficiently leads to compact but accurate models. Programmers found Synoptic useful for bug detection and other tasks.
Bandsaw: Log-powered test scenario generation for distributed systems (SOSP WIP: Proceedings of the 23rd ACM Symposium on Operating Systems Principles, Work In Progress Track, 2011): This short paper proposes a way to automatically generate test scenarios (and eventually full tests), by mining existing logs, modeling the results, and generating new traces from the models.
Mining temporal invariants from partially ordered logs (SIGOPS Operating Systems Review, 2011): Totally ordered (linear) logs have been well-studied in the invariant mining community, but many real logs (e.g., from distributed systems) are partially ordered. This paper gives algorithms for mining invariants from partially ordered logs.
Shedding light on distributed system executions (ICSE 2014, Proceedings of the 36th International Conference on Software Engineering, 2014): This paper presents two dools to help developers comprehend distributed systems: ShiVector to add vector timestamps to distributed system logs and ShiViz to visualize such logs as space-time diagrams.
Verdi: A framework for implementing and formally verifying distributed systems (PLDI 2015: Proceedings of the ACM SIGPLAN 2015 Conference on Programming Language Design and Implementation, 2015): This paper presents Verdi, a methodology and toolset for modularly implementing and formally verifying practical distributed systems in Coq. Case studies include a key-value store and a primary-backup replication mechanism.
Planning for change in a formal verification of the Raft consensus protocol (CPP 2016: 5th ACM SIGPLAN Conference on Certified Programs and Proofs, 2016): This paper provides a formal proof of correctness for a runnable implementation of the widely-used Raft consensus algorithm. It also discusses how to manage the incremental process of constructing such a proof.
Scalable verification of Border Gateway Protocol configurations with an SMT solver (OOPSLA 2016, Object-Oriented Programming Systems, Languages, and Applications, 2016): This paper presents Bagpipe, a verification tool for BGP configurations. Bagpipe has been proven correct, which required a mechanised semantics for BGP. Bagpipe found multiple errors in real-world BGP configurations.
Teaching rigorous distributed systems with efficient model checking (EuroSys, 2019): This paper introduces shows an effective way to teach distributed systems, with a set of labs, and a model checker and visual debugger used by students to verify and debug their solutions.
Visualizing Distributed System Executions (ACM Transactions on Software Engineering and Methodology, 2020): This paper presents two dools to help developers comprehend distributed systems: ShiVector to add vector timestamps to distributed system logs and ShiViz to visualize such logs as space-time diagrams.

Verification

Static verification of dynamically detected program invariants: Integrating Daikon and ESC/Java (RV 2001: Proceedings of the First Workshop on Runtime Verification, 2001): Dynamic invariant detection proposes likely (not certain) invariants; static verification requires program explicit identification of a goal and program annotation. This paper combines the two techniques to overcome the weaknesses of each.
Automatic generation of program specifications (ISSTA 2002, Proceedings of the 2002 International Symposium on Software Testing and Analysis, 2002): Sound program verifiers generally require program specifications, which are tedious and difficult to generate. A dynamic analysis can automatically produce unsound specifications. Combining the two techniques overcomes both weaknesses and demonstrates that the dynamic step, while unsound, can be quite accurate in practice.
Invariant inference for static checking: An empirical evaluation (FSE 2002, Proceedings of the ACM SIGSOFT 10th International Symposium on the Foundations of Software Engineering, 2002): Tool unsoundness and incompleteness may hinder users performing a task, or users may benefit even from imperfect output. In a study of program verification aided by dynamic invariant detection, even very poor output from the invariant detector aided users to a statistically significant degree.
Using simulated execution in verifying distributed algorithms (Software Tools for Technology Transfer, 2004): This paper proposes integration of dynamic analysis with traditional theorem-proving in ways that extend beyond mere testing. Generalizing over the test runs can reveal necessary lemmas, and the structure of the proof can mirror the structure of the execution.
An overview of JML tools and applications (Software Tools for Technology Transfer, 2005): This paper overviews the Java Modeling Language (JML) notation for detailed design and gives a brief description of some of the tools that take it as an input or produce it as an output.
Verification for legacy programs (VSTTE 2005: Verified Software: Theories, Tools, Experiments, 2005): Legacy (under-documented and -specified) code will be with us forever. This paper suggests ways to cope with such systems, both to ameliorate short-term problems and to advance toward a future in which all code is verified.
A type system for regular expressions (FTfJP: 14th Workshop on Formal Techniques for Java-like Programs, 2012): A program that uses regular expressions can fail at run time due to improper regex syntax or due to access of a non-existent capturing group. This paper presents a type system that detects and prevents such errors at compile time.
Verification games: Making verification fun (FTfJP: 14th Workshop on Formal Techniques for Java-like Programs, 2012): This paper presents a system that takes as input a program and a property, and produces as output a game. When a player finishes a level of the game, the final configuration of board elements can be translated to a proof of the property.
Reducing the barriers to writing verified specifications (OOPSLA 2012, Object-Oriented Programming Systems, Languages, and Applications, 2012): We present a novel user interface that enables ordinary developers, who have not been trained in formal verification, to perform such tasks quickly and efficiently. It also decomposes verification tasks, enabling multiple developers to collaborate.
Case studies and tools for contract specifications (ICSE 2014, Proceedings of the 36th International Conference on Software Engineering, 2014): How do developers use Code Contracts in practice? How does developers' use of Code Contracts compare to true contracts that they could have written? How do developers react when shown the difference? This paper answers these questions.
A type system for format strings (ISSTA 2014, Proceedings of the 2014 International Symposium on Software Testing and Analysis, 2014): This paper presents a type system that guarantees that invocations of format routines, such as printf, will not fail: the format string is well-formed and the arguments are compatible with the format string.
Collaborative verification of information flow for a high-assurance app store (CCS 2014: Proceedings of the 21st ACM Conference on Computer and Communications Security, 2014): This paper presents a practical, context-sensitive, flow-sensitive type system that verifies that an app satisfies an information flow policy. The system was effective in an adversarial Red Team evaluation.
Toward a dependability case language and workflow for a radiation therapy system (SNAPL 2015: the Inaugural Summit oN Advances in Programming Languages, 2015): This paper proposes the use of interactive theorem proving, solver-aided languages, and formally defined domain-specific languages for the development of a safety-critical radiotherapy medical device.
Verdi: A framework for implementing and formally verifying distributed systems (PLDI 2015: Proceedings of the ACM SIGPLAN 2015 Conference on Programming Language Design and Implementation, 2015): This paper presents Verdi, a methodology and toolset for modularly implementing and formally verifying practical distributed systems in Coq. Case studies include a key-value store and a primary-backup replication mechanism.
Lessons learned in game development for crowdsourced software formal verification (3GSE: USENIX Summit on Gaming, Games, and Gamification in Security Education, 2015): Crowd-sourcing provides the promise of making formal verification cheaper and thus more prevalent. This paper overviews 5 games by which unskilled people can perform program verification tasks.
Planning for change in a formal verification of the Raft consensus protocol (CPP 2016: 5th ACM SIGPLAN Conference on Certified Programs and Proofs, 2016): This paper provides a formal proof of correctness for a runnable implementation of the widely-used Raft consensus algorithm. It also discusses how to manage the incremental process of constructing such a proof.
Locking discipline inference and checking (ICSE 2016, Proceedings of the 38th International Conference on Software Engineering, 2016): This paper describes an abstract inference and a type system and for inferring and checking a locking discipline that provides value protection. It also compares programmer-written annotations to implementations of each.
Semantics for locking specifications (NFM 2016: 8th NASA Formal Methods Symposium, 2016): Java's @GuardedBy annotation is intended to prevent race conditions, but current tools give it a definition that permits race conditions. This paper formalizes a correct definition that prevents race conditions as intended.
Investigating safety of a radiotherapy machine using system models with pluggable checkers (CAV 2016: 28th International Conference on Computer Aided Verification, 2016): This paper describes a site of tools that we built to verify properties of a clinical radiotherapy system, and a Safety Case Checker (SCC) that connects them into a end-to-end safety argument.
SpaceSearch: A library for building and verifying solver-aided tools (ICFP 2017: Proceedings of the 22nd ACM SIGPLAN International Conference on Functional Programming, 2017): SpaceSearch is a library for developing solver-aided tools within a proof assistant. It enables the creation of provably correct tools that transform some problem from its original doman to that of a highly optimized solver.
Automatic formal verification for EPICS (ICALEPCS 2017: 16th International Conference on Accelerator and Large Experimental Physics Control Systems, 2017): This paper presents a symbolic interpreter that detects errors in EPICS programs. It also presents a formal semantics for the EPICS language, which revealed omissions and ambiguities in the EPICS specification.
Lightweight verification of array indexing (ISSTA 2018, Proceedings of the 2018 International Symposium on Software Testing and Analysis, 2018): This paper presents a fast, precise, scalable, easy-to-use, static verification technique for proving that all array accesses are within their bounds, based on multiple simple cooperating type systems.
Verifying Object Construction (ICSE 2020, Proceedings of the 42nd International Conference on Software Engineering, 2020): This paper introduces accumulation analysis, a restricted form of typestate that requires no alias analysis. It can verify correct use of the builder pattern and related patterns.
Continuous compliance (ASE 2020: Proceedings of the 35th Annual International Conference on Automated Software Engineering, 2020): Formal verification is a good match for compliance requirements that are imposed by customers on vendors. We deployed compliance verification tools in industry and ran case studies on over 60 million lines of code.
Verifying determinism in sequential programs (ICSE 2021, Proceedings of the 43rd International Conference on Software Engineering, 2021): This paper presents a type system that captures whether a value is deterministic (the same across runs), nondeterministic, or order-nondeterministic (each collection has the same contents across runs, but possibly in a different order).
Lightweight and modular resource leak verification (ESEC/FSE 2021: The ACM 29th joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2021): This paper presents a modular approach to detecting resource leaks that does not require a heavy-weight whole-program analysis. The approach is sound, is applicable to existing code, and is easy for developers to use.
Accumulation analysis (artifact) (DARTS, Dagstuhl Artifacts Series, 2022): This short paper discusses a literature review of research on typestate analysis, and provides the experimental data.
Accumulation analysis (ECOOP 2022 — Object-Oriented Programming, 33rd European Conference, 2022): This paper proves that accumulation typestate specifications can be checked soundly without aliasing information. Further, 41% of the typestate specifications in the research literature are accumulation typestate specifications.

Security

A simulation-based proof technique for dynamic information flow (PLAS 2007: ACM SIGPLAN Workshop on Programming Languages and Analysis for Security, 2007): A dynamic analysis of information flow (e.g., end-to-end confidentiality) is sound if its estimates of the amount of information revealed can never be too low. This paper proves such a soundness result by simulating an analyzed program with a pair of copies connected by a limited-bandwidth channel.
Quantitative information flow as network flow capacity (PLDI 2008: Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, 2008): To obtain a more precise measurement of the amount of secret information a program might reveal, we replace the usual technique of tainting (reachability from secret inputs) with a maximum-flow computation on a graph representation of execution with edge capacities. With appropriate optimizations, the technique scales to check realistic properties in several large C programs.
Automatic creation of SQL injection and cross-site scripting attacks (ICSE 2009, Proceedings of the 31st International Conference on Software Engineering, 2009): By generating inputs that cause the program to execute particular statements, then modifying those inputs into attack vectors, it is possible to prove the presence of real, exploitable security vulnerabilities in PHP programs.
Automatically patching errors in deployed software (SOSP 2009, Proceedings of the 22nd ACM Symposium on Operating Systems Principles, 2009): The ClearView system automatically: classifies executions as normal or attack; learns a model of the normal executions; generates patches that correct deviations from the model during an attack; evaluates the patches; and distributes the best one. A hostile Red Team evaluation shows that ClearView is effective.
Collaborative verification of information flow for a high-assurance app store (CCS 2014: Proceedings of the 21st ACM Conference on Computer and Communications Security, 2014): This paper presents a practical, context-sensitive, flow-sensitive type system that verifies that an app satisfies an information flow policy. The system was effective in an adversarial Red Team evaluation.
Boolean formulas for the static identification of injection attacks in Java (LPAR 2015: Proceedings of the 20th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, 2015): This paper describes a flow- and context-sensitive static analysis that automatically identifies where injections of tainted data can occur in a program.
Automatic trigger generation for rule-based smart homes (PLAS 2016: ACM SIGPLAN Workshop on Programming Languages and Analysis for Security, 2016): User-written rules consist of triggers and actions. If the user forgets to write some triggers, then the rules may not fire as often as desired by the user, leading to undesired behavior or security errors. This paper describes an analysis to find omitted trigger conditions.
Checking conformance of applications against GUI policies (ESEC/FSE 2021: The ACM 29th joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2021): Deceptive GUIs can cause interactions that users do not intend, and badly designed GUIs can reduce usability. This paper shows how to analyze an Android app to ensure that it obeys GUI best practices.

Refactoring

Automated support for program refactoring using invariants (ICSM 2001: Proceedings of the International Conference on Software Maintenance, 2001): This paper shows how program invariants can productively be used to identify candidate refactorings, which are small-scale program transformations that improve program structure, performance, or other features.
Converting Java programs to use generic libraries (OOPSLA 2004, Object-Oriented Programming Systems, Languages, and Applications, 2004): When type parameters are added to library code, client code should be upgraded to supply parameters at each use of library classes. This paper presents a sound and precise combined pointer and type-based analysis that does so.
Refactoring for parameterizing Java classes (ICSE 2007, Proceedings of the 29th International Conference on Software Engineering, 2007): Programmers must change their programs to take advantage of generic types. This paper presents a type-constraint-based analysis that converts non-generic classes such as List into generic classes such as List<T>.
Refactoring sequential Java code for concurrency via concurrent libraries (ICSE 2009, Proceedings of the 31st International Conference on Software Engineering, 2009): This paper presents a refactoring tool, Concurrencer, that transforms sequential code into parallel code that uses the java.util.concurrent libraries.
Refactoring using type constraints (ACM Transactions on Programming Languages and Systems, 2011): A program is one solution to the type constraints that are imposed by its interfaces and use of libraries. Other solutions may be possible, and they represent legal refactiorings of the program.
How do programs become more concurrent? A story of program transformations (Proceedings of the 4th International Workshop on Multicore Software Engineering, 2011): This paper is a historical analysis of the transformations that programmers used to convert sequential programs into concurrent versions.

Defect prediction

Predicting problems caused by component upgrades (ESEC/FSE 2003: Proceedings of the 9th European Software Engineering Conference and the 11th ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2003): A software upgrade may break a customer's system because of differences between it and the vendor's test environment. This paper shows how to predict such problems without having to integrate and test.
Finding latent code errors via machine learning over program executions (ICSE 2004, Proceedings of the 26th International Conference on Software Engineering, 2004): This paper shows the efficacy of a technique that performs machine learning over correct and incorrect programs, then uses the resulting models to identify latent errors in other programs.
Early identification of incompatibilities in multi-component upgrades (ECOOP 2004 — Object-Oriented Programming, 18th European Conference, 2004): This paper extends the technique of “Predicting problems caused by component upgrades” [ESEC/FSE 2003] to deal with multi-module upgrades and arbitrary calling patterns. It also presents 4 other enhancements and a case study of upgrading the C standard library as used by 48 Unix programs.
Formalizing lightweight verification of software component composition (SAVCBS 2004: Specification and Verification of Component-Based Systems, 2004): This paper formalizes (and corrects a few mistakes in) previous work on predicting problems caused by component upgrades. It presents an outline for a proof that the technique permits only sound upgrades.
Prioritizing warnings by analyzing software history (MSR 2007: International Workshop on Mining Software Repositories, 2007): Bug-finding tools prioritize their warning messages, but often do so poorly. This paper gives a new prioritization, based on which problems programmers have fixed most quickly in the past — those are likely to be the most important ones.
Which warnings should I fix first? (ESEC/FSE 2007: Proceedings of the 11th European Software Engineering Conference and the 15th ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2007): This paper determines how a bug-finding tool should prioritize its warnings. The prioritization is computed from the bug fix history of the program — the problems programmers considered important enough to fix.
CBCD: Cloned Buggy Code Detector (ICSE 2012, Proceedings of the 34th International Conference on Software Engineering, 2012): When programmers clone code, they also clone bugs. This paper presents a technique to find clones of just-fixed bugs, so the clones can also be fixed.
Proactive detection of inadequate diagnostic messages for software configuration errors (ISSTA 2015, Proceedings of the 2015 International Symposium on Software Testing and Analysis, 2015): ConfDiagDetector detects inadequate diagnostic messages for configuration errors. It works by injecting misconfigurations and using natural language processing to analyze the resulting diagnostic messages.

Test generation

Improving test suites via operational abstraction (ICSE 2003, Proceedings of the 25th International Conference on Software Engineering, 2003): This paper proposes a technique for selecting test cases that is similar to structural code coverage techniques, but operates in the semantic domain of program behavior rather than in the lexical domain of program text. The technique outperforms branch coverage in test suite size and in fault detection.
Eclat: Automatic generation and classification of test inputs (ECOOP 2005 — Object-Oriented Programming, 19th European Conference, 2005): This paper presents an automatic mechanism for selecting tests that are likely to expose errors — tests whose run-time behavior is maximally different from succeeding runs. The paper also gives techniques for test input generation and for converting a test input into a test case.
An empirical comparison of automated generation and classification techniques for object-oriented unit testing (ASE 2006: Proceedings of the 21st Annual International Conference on Automated Software Engineering, 2006): This paper experimentally evaluates four test generation strategies: the four combinations of model-based vs. exception-based, and symbolic vs. random. The model-based symbolic combination is new.
Finding the needles in the haystack: Generating legal test inputs for object-oriented programs (M-TOOS: 1st Workshop on Model-Based Testing and Object-Oriented Systems, 2006): An automatically inferred model of legal method call sequences can bias test generation toward legal method sequences, increasing coverage and creating data structures beyond the ability of undirected random generation.
Feedback-directed random test generation (ICSE 2007, Proceedings of the 29th International Conference on Software Engineering, 2007): This paper presents a technique that improves random test generation by incorporating feedback from executing previously-generated inputs. The technique found serious, previously-unknown errors in widely-deployed applications.
Automatic creation of SQL injection and cross-site scripting attacks (ICSE 2009, Proceedings of the 31st International Conference on Software Engineering, 2009): By generating inputs that cause the program to execute particular statements, then modifying those inputs into attack vectors, it is possible to prove the presence of real, exploitable security vulnerabilities in PHP programs.
Finding bugs in web applications using dynamic test generation and explicit state model checking (IEEE Transactions on Software Engineering, 2010): This paper extends dynamic test generation, based on combined concrete and symbolic execution, to the new domain of web applications, and finds 673 bugs in 6 PHP web applications.
Combined static and dynamic automated test generation (ISSTA 2011, Proceedings of the 2011 International Symposium on Software Testing and Analysis, 2011): The Palus tool performs constraint-based random test generations, using constraints learned from dynamic executions and from static analysis of method coupling. Palus outperforms all other known test generation approaches, and found real bugs in well-tested commercial software.
Scaling up automated test generation: Automatically generating maintainable regression unit tests for programs (ASE 2011: Proceedings of the 26th Annual International Conference on Automated Software Engineering, 2011): This paper shows how to apply automatic test generation to the domain of programs (not just libraries), and how to make the resulting test suite maintainable (easy to understand and to adapt to changes in the program).
Automatic generation of oracles for exceptional behaviors (ISSTA 2016, Proceedings of the 2016 International Symposium on Software Testing and Analysis, 2016): This paper gives a technique that converts English documentation about thrown exceptions (say, expressed in Javadoc) into assertions that determine whether a program throws the correct exceptions.
Translating code comments to procedure specifications (ISSTA 2018, Proceedings of the 2018 International Symposium on Software Testing and Analysis, 2018): This paper gives an improved technique that converts English Javadoc procedure documentation (preconditions, postconditions, exceptions) into executable expresssions that can be used as assertions.

Testing

Reducing wasted development time via continuous testing (ISSRE 2003: Fourteenth International Symposium on Software Reliability Engineering, 2003): Early notification of problems enables cheaper fixes. This paper evaluates how much developer time could be saved by continuous testing, which uses extra CPU cycles to continuously run tests in the background.
Continuous testing in Eclipse (2nd Eclipse Technology Exchange Workshop (eTX), 2004): This paper describes the design principles, user interface, architecture, and implementation of a publicly-available continuous testing plug-in for the Eclipse development environment.
Automatic mock object creation for test factoring (PASTE 2004: ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE'04), 2004): A set of small, fast-running tests can be more useful than a single larger test. This paper proposes a way to automatically factor a large test case into smaller tests, using mock objects to model the environment.
An experimental evaluation of continuous testing during development (ISSTA 2004, Proceedings of the 2004 International Symposium on Software Testing and Analysis, 2004): Continuous testing during development provides early notification of errors. This paper reports a controlled experiment to measure whether and how programmers benefit from continuous testing.
Automatic test factoring for Java (ASE 2005: Proceedings of the 20th Annual International Conference on Automated Software Engineering, 2005): Test factoring creates fast, focused unit tests from slow system-wide tests. Each new unit test exercises only a subset of the functionality exercised by the system test.
Theories in practice: Easy-to-write specifications that catch bugs (TR, 2008): A “theory” is a claim about one or more objects — an executable predicate that generalizes over a (possibly infinite) set of tests. Theories serve as a type of specification that can complement or supersede individual tests.
ReCrash: Making software failures reproducible by preserving object states (ECOOP 2008 — Object-Oriented Programming, 22nd European Conference, 2008): ReCrash is a lightweight technique that monitors a program for failures such as crashes. When a failure occurs in the field, ReCrash creates multiple unit tests that reproduce it. This eases debugging and fixing the errors.
Automated documentation inference to explain failed tests (ASE 2011: Proceedings of the 26th Annual International Conference on Automated Software Engineering, 2011): When a test fails, debugging begins. The FailureDoc tool indicates which parts of a failed test are most relevant to the failure, which helps programmers to understand and fix the failure.
Automated diagnosis of software configuration errors (ICSE 2013, Proceedings of the 35th International Conference on Software Engineering, 2013): Sometimes, incorrect software behavior is not due to a bug, but to supplying an incorrect configuration option. The ConfDiagnoser tool suggests a configuration option to change so that the software behaves as desired.
Efficient mutation analysis by propagating and partitioning infected execution states (ISSTA 2014, Proceedings of the 2014 International Symposium on Software Testing and Analysis, 2014): Mutation analysis — a technique for evaluating test suites — is notoriously compute-intensive. This paper reduces its cost by 40%, by running the program being tested a single time in a prepass to determine which test results can be predicted and thus do not need to be run.
Empirically revisiting the test independence assumption (ISSTA 2014, Proceedings of the 2014 International Symposium on Software Testing and Analysis, 2014): A test is indepent if its result does not depend on what other tests have been run. Test prioritizaton and selection assume test independence. We show that both human-written and automatically-generated test suites violate the standard test independence assumption.
Defects4J: A database of existing faults to enable controlled testing studies for Java programs (ISSTA 2014, Proceedings of the 2014 International Symposium on Software Testing and Analysis, 2014): Defects4J is a database of real bugs from open source programs. Each bug is isolated and reproducible. This database enables researchers to evaluate their techniques on real-world bugs.
Are mutants a valid substitute for real faults in software testing? (FSE 2014: Proceedings of the ACM SIGSOFT 22nd Symposium on the Foundations of Software Engineering, 2014): Much research in software testing uses mutants (artificial defects) in place of real ones. We investigate whether this assumption is warranted.
When tests collide: Evaluating and coping with the impact of test dependence (TR, 2015): This paper shows that test prioritization, selection, and parallelization yield incorrect results (test suites that spuriously fail) for real-world test suites, and gives an algorithm that makes those techniques correct.
Comparing developer-provided to user-provided tests for fault localization and automated program repair (ISSTA 2018, Proceedings of the 2018 International Symposium on Software Testing and Analysis, 2018): Fault localization experiments are usually performed using tests commited to found in version control systems. These may have been committed after the developer had investigated a fix. Using test cases from issue trackers, as would be the case in realistic practice, leads to worse fault localization results.
Dependent-test-aware regression testing techniques (ISSTA 2020, Proceedings of the 2020 International Symposium on Software Testing and Analysis, 2020): Dependent tests are tests that pass when run in one order but fail when run in another order. This paper presents a way to detect test dependencies, plus test prioritization, selection, and parallelization techniques that avoid violating the dependencies.
Revisiting the relationship between fault detection, test adequacy criteria, and test set size (ASE 2020: Proceedings of the 35th Annual International Conference on Automated Software Engineering, 2020): Test suite size has a big impact on test effectiveness. How should experiments control for test suite size? How do size and test adequacy (e.g., coverage) affect the fault detection power of a test suite?

Debugging

Interactive record/replay for web application debugging (UIST 2013: Proceedings of the 26th ACM Symposium on User Interface Software and Technology, 2013): This paper presents a record-replay infrastructure for web applications. It features lightweight recording, and its replay is compatible with debuggers and logging. This enables new developer debugging tools.
Which configuration option should I change? (ICSE 2014, Proceedings of the 36th International Conference on Software Engineering, 2014): A new version of a system may require different configuration options than a previous version did. This paper gives an automated technique for suggesting which configuration options to change. It outperforms previous approaches.
Evaluating and improving fault localization (ICSE 2017, Proceedings of the 39th International Conference on Software Engineering, 2017): Previously, fault localization techniques were evaluated on artificial faults, such as mutants. None of the previously-reported results holds for real faults, indicating that artificial faults should not be used to evaluate fault localization techniques.
An empirical study of fault localization families and their combinations (IEEE Transactions on Software Engineering, 2019): This paper evaluates multiple fault localization techniques on real-world faults, and indicates how to combine them to improve them further.

Program repair

Improving adaptability via program steering (ISSTA 2004, Proceedings of the 2004 International Symposium on Software Testing and Analysis, 2004): Program steering selects modalities for a program that may operate in several modes. This paper's experiments show that program steering can substantially improve performance in unanticipated situations.
Inference and enforcement of data structure consistency specifications (ISSTA 2006, Proceedings of the 2006 International Symposium on Software Testing and Analysis, 2006): This paper presents an automatic technique for recovering from data structure corruption errors. It involves inferring correctness constraints, then repairing any errors that occur at run time. The evaluation includes a hostile “Red Team” evaluation.
Automatically patching errors in deployed software (SOSP 2009, Proceedings of the 22nd ACM Symposium on Operating Systems Principles, 2009): The ClearView system automatically: classifies executions as normal or attack; learns a model of the normal executions; generates patches that correct deviations from the model during an attack; evaluates the patches; and distributes the best one. A hostile Red Team evaluation shows that ClearView is effective.
Automated diagnosis of software configuration errors (ICSE 2013, Proceedings of the 35th International Conference on Software Engineering, 2013): Sometimes, incorrect software behavior is not due to a bug, but to supplying an incorrect configuration option. The ConfDiagnoser tool suggests a configuration option to change so that the software behaves as desired.
Automatically repairing broken workflows for evolving GUI applications (ISSTA 2013, Proceedings of the 2013 International Symposium on Software Testing and Analysis, 2013): When a UI changes, a user's workflow must also change. The FlowFixer tool automatically translates an old workflow into one that works on the new UI.

Specification inference

Dynamically discovering pointer-based program invariants (TR, 1999): This paper extends dynamic invariant detection to pointer-directed data structures by linearizing those structures and, more significantly, by permitting inference of disjunctive invariants (like “A or B”).
Quickly detecting relevant program invariants (ICSE 2000, Proceedings of the 22nd International Conference on Software Engineering, 2000): Dynamic invariant detection can output many unhelpful properties as well as those of interest. This paper gives four techniques that eliminate uninteresting invariants or add relevant ones, thus increasing the technique's usefulness to programmers.
Dynamically Discovering Likely Program Invariants (Ph.D. dissertation, 2000): This dissertation overviews dynamic invariant detection as of summer 2000. It includes materials from the papers on the basic invariant detection techniques (IEEE TSE, 2001), relevance improvements (ICSE 2000), and extensions to pointer-based data structures (UW TR, 1999).
Dynamically discovering likely program invariants to support program evolution (IEEE Transactions on Software Engineering, 2001): Program properties (such as formal specifications or assert statements) are useful for a variety of programming tasks. This paper shows how to dynamically infer program properties by looking for patterns and relationships among values computed at run time.
Static verification of dynamically detected program invariants: Integrating Daikon and ESC/Java (RV 2001: Proceedings of the First Workshop on Runtime Verification, 2001): Dynamic invariant detection proposes likely (not certain) invariants; static verification requires program explicit identification of a goal and program annotation. This paper combines the two techniques to overcome the weaknesses of each.
Automated support for program refactoring using invariants (ICSM 2001: Proceedings of the International Conference on Software Maintenance, 2001): This paper shows how program invariants can productively be used to identify candidate refactorings, which are small-scale program transformations that improve program structure, performance, or other features.
Automatic generation of program specifications (ISSTA 2002, Proceedings of the 2002 International Symposium on Software Testing and Analysis, 2002): Sound program verifiers generally require program specifications, which are tedious and difficult to generate. A dynamic analysis can automatically produce unsound specifications. Combining the two techniques overcomes both weaknesses and demonstrates that the dynamic step, while unsound, can be quite accurate in practice.
Invariant inference for static checking: An empirical evaluation (FSE 2002, Proceedings of the ACM SIGSOFT 10th International Symposium on the Foundations of Software Engineering, 2002): Tool unsoundness and incompleteness may hinder users performing a task, or users may benefit even from imperfect output. In a study of program verification aided by dynamic invariant detection, even very poor output from the invariant detector aided users to a statistically significant degree.
Determining legal method call sequences in object interfaces (2003): Correct functioning of a component can depend on the component's methods being called in the correct order. This paper simplifies and improves on previous techniques for determining the legal call sequences.
Efficient incremental algorithms for dynamic detection of likely invariants (FSE 2004: Proceedings of the ACM SIGSOFT 12th Symposium on the Foundations of Software Engineering, 2004): This paper shows how to perform dynamic invariant detection both incrementally, efficiently, and without using only a trivial grammar; previous implementations suffered from one or more such problems.
The Daikon system for dynamic detection of likely invariants (Science of Computer Programming, 2007): This paper discusses the Daikon tool, including its features, applications, architecture, and development process. It is not a paper about dynamic invariant detection per se.
Leveraging existing instrumentation to automatically infer invariant-constrained models (ESEC/FSE 2011: The 8th joint meeting of the European Software Engineering Conference (ESEC) and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), 2011): Synoptic infers, from system logs, a finite state machine model of the system. Synoptic's novel inference algorithm efficiently leads to compact but accurate models. Programmers found Synoptic useful for bug detection and other tasks.
Bandsaw: Log-powered test scenario generation for distributed systems (SOSP WIP: Proceedings of the 23rd ACM Symposium on Operating Systems Principles, Work In Progress Track, 2011): This short paper proposes a way to automatically generate test scenarios (and eventually full tests), by mining existing logs, modeling the results, and generating new traces from the models.
Mining temporal invariants from partially ordered logs (SIGOPS Operating Systems Review, 2011): Totally ordered (linear) logs have been well-studied in the invariant mining community, but many real logs (e.g., from distributed systems) are partially ordered. This paper gives algorithms for mining invariants from partially ordered logs.
Case studies and tools for contract specifications (ICSE 2014, Proceedings of the 36th International Conference on Software Engineering, 2014): How do developers use Code Contracts in practice? How does developers' use of Code Contracts compare to true contracts that they could have written? How do developers react when shown the difference? This paper answers these questions.
Inferring models of concurrent systems from logs of their behavior with CSight (ICSE 2014, Proceedings of the 36th International Conference on Software Engineering, 2014): This paper provides a new approach to inferring a communicating finite state machine (CFSM) from executions of a concurrent system. The paper also proves its techniques correct and evaluates an implementation on system logs and a user study.
Using declarative specification to improve the understanding, extensibility, and comparison of model-inference algorithms (IEEE Transactions on Software Engineering, 2015): Model inference algorithms summarize a log to make it easier to understand. But the algorithm itself can be hard to understand, especially if specified procedurally. InvariMint gives a declarative and efficient way to specify model inference algorithms.

Dynamic analysis

WODA 2003: ICSE Workshop on Dynamic Analysis (2003): This workshop brought together practitioners and academics to discuss topics from the structure of the dynamic analysis field, to how to enable better and easier progress, to specific new analysis ideas.
Static and dynamic analysis: Synergy and duality (WODA 2003: Workshop on Dynamic Analysis, 2003): This position paper is intended to provoke thought and discussion regarding the relationship between static and dynamic analysis. The paper claims they are not as different as many have assumed.
Learning from executions: Dynamic analysis for software engineering and program understanding (2005): A broad overview of the field of dynamic analysis, including applications of dynamic analysis, dynamic analysis algorithms, and implementation details. The slides capture only part of the tutorial, and naturally the tutorial only captures part of this exciting field.
Detection of web service substitutability and composability (WS-MaTe: International Workshop on Web Services — Modeling and Testing, 2006): Programmers, end users, and even program would like to discover and reuse software services. This paper presents an approach to indicate when Web Services may be substituted for one another or composed.
Dynamic inference of abstract types (ISSTA 2006, Proceedings of the 2006 International Symposium on Software Testing and Analysis, 2006): This paper presents a dynamic analysis for computing abstract types, which indicate which values and variables may interact at run time. The paper also presents implementations and experiments for C++ and Java.

Speculative analysis

Speculative analysis: Exploring future development states of software (FoSER: Workshop on the Future of Software Engineering Research, 2010): This paper proposes that tools should analyze program versions that do not exist yet but that a developer might create. This may help the developer decide in advance whether or not to create that particular version.
Predicting development trajectories to prevent collaboration conflicts (FutureCSD 2012: The Future of Collaborative Software Development, 2012): Rather than dealing with source code conflicts when a developer notices, or using a tool to detect them as soon as they arise, it is better to predict and prevent them before they happen.
Speculative analysis of integrated development environment recommendations (OOPSLA 2012, Object-Oriented Programming Systems, Languages, and Applications, 2012): An IDE lets a developer modify his or her program, but does not indicate the consequences of those changes. Our tool, Quick Fix Scout, shows the consequences of Eclipse's Quick Fix recommendations, in terms of compilation errors introduced or eliminated. It makes developers faster in eliminating compilation errors.
Early detection of collaboration conflicts and risks (IEEE Transactions on Software Engineering, 2013): The Crystal tool informs a developer when his individual work can be safely merged with his co-workers' changes, and when his individual work conflicts with co-workers, by speculatively performing version control system operations in the background.
Reducing feedback delay of software development tools via continuous analysis (IEEE Transactions on Software Engineering, 2015): An offline program analysis runs when the developer invokes it. A continuous analysis operates automatically on the current codebase and asynchronously informs the developer of results. By using a shadow copy of the codebase, an offline analysis can be made continuous.
Development history granularity transformations (ASE 2015: Proceedings of the 30th Annual International Conference on Automated Software Engineering, 2015): This paper shows how to transform a development history into multiple granularities that are appropriate for different software engineering tasks.

Software engineering

Panel: Perspectives on software engineering (ICSE 2001, Proceedings of the 23rd International Conference on Software Engineering, 2001): This talk argues that tools are key to having impact on software engineering; they must be published; case studies are more fruitful than experiments; researchers have the responsibility to target the state of the art; static and dynamic tools should be combined; and lightweight formal tools are the right place to start.
PASTE: ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (2005): PASTE 2005 brought together the program analysis, software tools, and software engineering communities to focus on applications of program analysis techniques in software tools.
Rethinking the economics of software engineering (FoSER: Workshop on the Future of Software Engineering Research, 2010): We propose to build tools that permit less-skilled workers to perform certain tasks currently done by developers. This frees developers to do tasks that only they can do, and overall makes software less expensive.
Reproducible tests? Non-duplicable results in testing and verification (ICST 2012: Fifth International Conference on Software Testing, Verification and Validation (ICST), 2012): This keynote talk is a call to arms to the community to treat research more like software: it deserves to be tested and replicable.
Defects4J: A database of existing faults to enable controlled testing studies for Java programs (ISSTA 2014, Proceedings of the 2014 International Symposium on Software Testing and Analysis, 2014): Defects4J is a database of real bugs from open source programs. Each bug is isolated and reproducible. This database enables researchers to evaluate their techniques on real-world bugs.

User interfaces

Finding errors in multithreaded GUI applications (ISSTA 2012, Proceedings of the 2012 International Symposium on Software Testing and Analysis, 2012): In a GUI program, only the main thread (event thread) should access GUI objects. Our analysis verifies this property, and found crashing errors in Android, Eclipse plugin, Swing, and SWT applications.
Reducing the barriers to writing verified specifications (OOPSLA 2012, Object-Oriented Programming Systems, Languages, and Applications, 2012): We present a novel user interface that enables ordinary developers, who have not been trained in formal verification, to perform such tasks quickly and efficiently. It also decomposes verification tasks, enabling multiple developers to collaborate.
JavaUI: Effects for controlling UI object access (ECOOP 2013 — Object-Oriented Programming, 27th European Conference, 2013): In a GUI program, only the main thread (event thread) should access GUI objects. Our sound static analysis verifies this property through an effect system, and scaled to 140KLOC.
Automatically repairing broken workflows for evolving GUI applications (ISSTA 2013, Proceedings of the 2013 International Symposium on Software Testing and Analysis, 2013): When a UI changes, a user's workflow must also change. The FlowFixer tool automatically translates an old workflow into one that works on the new UI.
Interactive record/replay for web application debugging (UIST 2013: Proceedings of the 26th ACM Symposium on User Interface Software and Technology, 2013): This paper presents a record-replay infrastructure for web applications. It features lightweight recording, and its replay is compatible with debuggers and logging. This enables new developer debugging tools.
User scripting on Android using BladeDroid (APSys 2014: 5th Asia-Pacific Workshop on Systems, 2014): User scripting is common on the Web, but unknown for mobile applications. Our BladeDroid system enables users to add scripts to mobile apps. The implementation uses bytecode rewriting and dynamic class loading.
Explaining visual changes in web interfaces (UIST 2015: Proceedings of the 28th ACM Symposium on User Interface Software and Technology, 2015): The Scry tool enables a developer to navigate forward and backward in a web page's execution. When the developer selects a web page element, Scry shows how the browser state differed and what JavaScript code is responsible.
Verifying that web pages have accessible layout (PLDI 2018: Proceedings of the ACM SIGPLAN 2016 Conference on Programming Language Design and Implementation, 2018): The VizAssert tool proves that a web page's layout satisfies a usability or accessibility property, such as that text never overlaps or buttons are onscreen, regardless of screen size, font preferences, browser, etc.
Modular verification of web page layout (Proceedings of the ACM on Programming Languages, 2019): The Troika tool modularizes previous work on layout verification, splitting a monolithic verification problem into independent parts, one for each component of a web page.

Natural language processing

Automated documentation inference to explain failed tests (ASE 2011: Proceedings of the 26th Annual International Conference on Automated Software Engineering, 2011): When a test fails, debugging begins. The FailureDoc tool indicates which parts of a failed test are most relevant to the failure, which helps programmers to understand and fix the failure.
Proactive detection of inadequate diagnostic messages for software configuration errors (ISSTA 2015, Proceedings of the 2015 International Symposium on Software Testing and Analysis, 2015): ConfDiagDetector detects inadequate diagnostic messages for configuration errors. It works by injecting misconfigurations and using natural language processing to analyze the resulting diagnostic messages.
Ayudante: Identifying undesired variable interactions (WODA 2015: 13th International Workshop on Dynamic Analysis, 2015): Two different ways of identifying related variables are those that interact (such as being involved in the same + operation) and those with similar variable names. When these give different results, it can indicate a bug.
Automatic generation of oracles for exceptional behaviors (ISSTA 2016, Proceedings of the 2016 International Symposium on Software Testing and Analysis, 2016): This paper gives a technique that converts English documentation about thrown exceptions (say, expressed in Javadoc) into assertions that determine whether a program throws the correct exceptions.
Program synthesis from natural language using recurrent neural networks (TR, 2017): Tellina is a system that translates an English description of a file system operation into a bash command. A controlled user experiment shows that it helps programmers perform file system tasks more quickly.
Natural language is a programming language: Applying natural language processing to software development (SNAPL 2017: the 2nd Summit oN Advances in Programming Languages, 2017): Software developers embed much natural language (e.g., English) in their code, including in error messages, variable names, and comments. This paper advocates analyzing the natural language as well as the source code.
NL2Bash: A corpus and semantic parser for natural language interface to the Linux operating system (LREC: Language Resources and Evaluation Conference, 2018): NL2Bash is a dataset containing pairs of (English text, Bash commands). This dataset can be used in training and evaluating tools that translate from English to bash.
Translating code comments to procedure specifications (ISSTA 2018, Proceedings of the 2018 International Symposium on Software Testing and Analysis, 2018): This paper gives an improved technique that converts English Javadoc procedure documentation (preconditions, postconditions, exceptions) into executable expresssions that can be used as assertions.
Where should I comment my code? A dataset and model for predicting locations that need comments (ICSE NIER, Proceedings of the 42nd International Conference on Software Engineering, New Ideas and Emerging Results Track, 2020): This paper shows how to predict where programmers should write code comments. Its models achieved precision of 74% and recall of 13%.
MeMo: Automatically identifying metamorphic relations in Javadoc comments for test automation (Journal of Systems and Software, 2021): A metamorphic relation expresses that two expressions have the same value. MeMo translates natural-language documentation into metamorphic relations.
Call Me Maybe: Using NLP to automatically generate unit test cases respecting temporal constraints (ASE 2022: Proceedings of the 37th Annual International Conference on Automated Software Engineering, 2022): This paper presents an NLP technique that analyzes code comments to identify temporal constraints such as “the thread should be started after setting the daemon”, and it converts them into executable assertions.

Theory

ML typechecking is not efficient (Papers of the MIT ACM Undergraduate Conference, 1989): ML programmers have the intuition that ML's typechecking algorithm runs in linear time. This paper is an early description of the surprising result (not due to me) that ML typechecking runs in time much worse than exponential in the size of its input.
Adequate Models for Recursive Program Schemes (Bachelors thesis, 1989): This paper is a pedagogical exposition of adequacy for recursive program schemes in the monotone frame. Adequacy states that for any term of base type, the operational and denotational meanings are identical, and it is typically proved in the continuous frame.
Heraclitean encryption (TR, 1994): Heraclitean encryption permits an encryptor to create many independent decryption keys for converting a particular codetext into plaintext. This permits tracing of decryption keys and, in the event of a compromise, determination of which decryptor leaked his or her key.
Playing Konane mathematically: A combinatorial game-theoretic analysis (UMAP Journal, 1995): This paper presents a combinatorial game-theoretic analysis of Konane, an ancient Hawaiian stone-jumping game. Combinatorial game theory indicates, for a given board position, which player wins, and how great that player's advantage is.
Method and system for controlling unauthorized access to information distributed to users (1996): This patent describes a system with unique decryption keys for each purchaser of an encrypted product, so as to determine who revealed a decryption key if one is made public.
Graphs induced by Gray codes (Discrete Mathematics, 2002): A Gray code on n bits induces a graph over n vertices such that vertices i and j are adjacent when bit positions i and j flip consecutively during the code. This paper answers some questions about what sorts of graphs can (and cannot) be induced by Gray codes.

Education

The Groupthink specification exercise (ICSE Education and Training Track: Software Engineering Education in the Modern Age: Challenges and Possibilities, PostProceedings of ICSE '05 Education and Training Track, 2006): The Groupthink specification exercise is a fun group activity that teaches students about specifications, teamwork, and communication. This paper describes both the goals of the activity (along with an assessment of it), and the mechanics of how to run it.
Introductory programming meets the real world: Using real problems and data in CS1 (SIGCSE: Proceedings of the 45th ACM Technical Symposium on Computer Science Education, 2014): Most introductory programming classes base their examples and assignments on puzzles, games, and abstract math. We describe an approach that engages a different type of student. Each assignment uses a real-world dataset and answers a real question from science, engineering, business, etc.
A data programming CS1 course (SIGCSE: Proceedings of the 46th ACM Technical Symposium on Computer Science Education, 2015): This paper describes experience, at 4 colleges and universities, with an approach to teaching introductory programming in which each assignment uses a real-world dataset and answers a real question from science, engineering, business, etc.
Teaching rigorous distributed systems with efficient model checking (EuroSys, 2019): This paper introduces shows an effective way to teach distributed systems, with a set of labs, and a model checker and visual debugger used by students to verify and debug their solutions.

Miscellaneous

Image/map correspondence using curve matching (AAAI Spring Symposium on Robot Navigation, 1989): Correspondences between a real-world image and a map or terrain model can assist in navigation. This paper describes a technique to find such correspondences by focusing only on salient curves (roads, rivers, ridgelines, etc.) rather than the entire image.
Self-reference in English (1989): This paper shows that certain types of linguistic paradoxes cannot themselves be expressed without special linguistic conventions; this fact has since been dubbed “Ernst's Paradox”. This paper is recapitulated by “Quotational Ambiguity,” by George Boolos, in Proceedings of the Conference in Honor of W. V. O. Quine, San Marino, May 1990 (edited by Paulo Leonardi).
Intellectual property in computing: (How) should software be protected? An industry perspective (Memo, 1992): This was the first colloquium on intellectual property in computing to bring the debate to the technical and social realm; previous analyses had been from a narrow legal perspective. Computer industry executives discussed how "look and feel" user interface copyrights and software patents could affect the industry.
Automatic SAT-compilation of planning problems (IJCAI '97: IJCAI-97, Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, 1997): Planning problems can be solved by transformation into a propositional satisfiability problem, solution, and transformation of the SAT solution into a plan. This paper describes the first automatic translator from planning to SAT and evaluates the space of encodings of planning problems as SAT formulas.
The HaLoop approach to large-scale iterative data analysis (The VLDB Journal, 2012): An iterative job may repeat similar work on each iteration. This paper shows how to avoid repeating work in iterative MapReduce jobs, substantially improving performance.

Mobile applications

Finding errors in multithreaded GUI applications (ISSTA 2012, Proceedings of the 2012 International Symposium on Software Testing and Analysis, 2012): In a GUI program, only the main thread (event thread) should access GUI objects. Our analysis verifies this property, and found crashing errors in Android, Eclipse plugin, Swing, and SWT applications.
User scripting on Android using BladeDroid (APSys 2014: 5th Asia-Pacific Workshop on Systems, 2014): User scripting is common on the Web, but unknown for mobile applications. Our BladeDroid system enables users to add scripts to mobile apps. The implementation uses bytecode rewriting and dynamic class loading.
Collaborative verification of information flow for a high-assurance app store (CCS 2014: Proceedings of the 21st ACM Conference on Computer and Communications Security, 2014): This paper presents a practical, context-sensitive, flow-sensitive type system that verifies that an app satisfies an information flow policy. The system was effective in an adversarial Red Team evaluation.
Static analysis of implicit control flow: Resolving Java reflection and Android intents (ASE 2015: Proceedings of the 30th Annual International Conference on Automated Software Engineering, 2015): Standard program analysis understands control flow through direct procedure calls. This paper provides analyses that resolve uses of Java reflecton and the data passed in Android intents.
Boolean formulas for the static identification of injection attacks in Java (LPAR 2015: Proceedings of the 20th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, 2015): This paper describes a flow- and context-sensitive static analysis that automatically identifies where injections of tainted data can occur in a program.
Verifying that web pages have accessible layout (PLDI 2018: Proceedings of the ACM SIGPLAN 2016 Conference on Programming Language Design and Implementation, 2018): The VizAssert tool proves that a web page's layout satisfies a usability or accessibility property, such as that text never overlaps or buttons are onscreen, regardless of screen size, font preferences, browser, etc.
Checking conformance of applications against GUI policies (ESEC/FSE 2021: The ACM 29th joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2021): Deceptive GUIs can cause interactions that users do not intend, and badly designed GUIs can reduce usability. This paper shows how to analyze an Android app to ensure that it obeys GUI best practices.
Call graph soundness in Android static analysis (ISSTA 2024, Proceedings of the 2024 International Symposium on Software Testing and Analysis, 2024): We analyzed 13 static analyzers for Android code. They are very unsound: they failed to analyze 61%, on average, of the dynamically-executed methods. Tools were bad at discovering entry points, higher-precision tools were most unsound.

Michael Ernst