12:00-12:20
Welcome and Lunch
Session 1: Tools and Synthesis (chair: Rishabh Singh)
12:20-12:40
Calvin Loncaric (UW): Data Structure Synthesis
Many applications require specialized data structures not found in the standard libraries, but implementing new data structures by hand is tedious and error-prone. This paper presents a novel approach for synthesizing efficient implementations of complex collection data structures from high-level specifications that describe the desired retrieval operations. Our approach handles a wider range of data structures than previous work, including structures that maintain an order among their elements or have complex retrieval methods. We have prototyped our approach in a data structure synthesizer called Cozy. Four large, real-world case studies compare structures generated by Cozy against handwritten implementations in terms of correctness and performance.
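To make the gap between specification and implementation concrete, here is a hypothetical Python sketch (Cozy's actual input language and generated code differ): the specification states what a retrieval operation returns, and the synthesizer's job is to derive a specialized structure, such as a hash index, that answers the same query efficiently.

```python
# Illustrative sketch (not Cozy's input language): the kind of gap that
# data structure synthesis bridges. The "specification" is a clear but
# slow retrieval operation; the "implementation" maintains an index so
# the same query runs in O(1) instead of O(n).

class UsersSpec:
    """Specification view: state is a plain collection, queries are scans."""
    def __init__(self):
        self.users = []          # list of (uid, name) pairs

    def add(self, uid, name):
        self.users.append((uid, name))

    def lookup(self, uid):
        return [n for (u, n) in self.users if u == uid]   # O(n) scan

class UsersImpl:
    """Implementation view: a specialized structure with a hash index."""
    def __init__(self):
        self.by_uid = {}         # uid -> list of names

    def add(self, uid, name):
        self.by_uid.setdefault(uid, []).append(name)

    def lookup(self, uid):
        return self.by_uid.get(uid, [])                   # O(1) lookup
```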
12:40-1:00
Eric Larson (Seattle University): Generating Evil Test Strings for Regular Expressions
Regular expressions are a powerful string-processing tool. However, they are error-prone and receive little error checking from the compiler, since most regular expressions are syntactically correct. This paper describes EGRET, a tool for generating evil test strings for regular expressions. EGRET focuses on common mistakes made by developers when creating regular expressions and develops test strings that expose these errors. EGRET has found errors in 284 out of 791 regular expressions. Prior approaches to test string generation have traversed all possible paths in the equivalent nondeterministic finite state automaton, leading to the generation of too many strings. EGRET keeps the set of test strings to a manageable number: fewer than 100 test strings were generated for 96% of the regular expressions, and only 307 were generated for the most complex regular expression.
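As a hedged illustration of the kind of mistake EGRET targets (this example is not drawn from the paper's corpus), consider an alternation-precedence bug that a short evil string exposes:

```python
import re

# A common alternation-precedence mistake. The developer intends
# "the whole string is 'a' or 'b'":
buggy    = re.compile(r'^a|b$')     # actually parses as (^a) | (b$)
intended = re.compile(r'^(a|b)$')

evil = 'cab'                        # neither 'a' nor 'b', yet ends in 'b'
print(bool(buggy.search(evil)))     # True  -- the bug: 'cab' matches b$
print(bool(intended.search(evil)))  # False -- the intended behavior
```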
1:00-1:20
Michael Ernst (UW): Natural Language Processing Meets Software Testing
Software is commonly thought of as a sequence of instructions that achieve some particular task and that can be formally analyzed. Software also consists of tests, executions, documentation, version control artifacts, issue trackers, and more. This talk discusses three projects that utilize natural language (English) that appears in or with source code.
- One technique uses variable names to discover when variables are used incorrectly in a program.
- One technique uses configuration mutation, then compares observed error messages to text in the user manual to determine whether the error message is adequate.
- One technique uses Javadoc specifications to create test oracles.
Joint work with Juan Caballero, Alberto Goffi, Alessandra Gorla, Mauro Pezze, Irfan Ul Haq, and Sai Zhang.
1:20-1:40
Ivan Beschastnikh (UBC): Investigating Program Behavior with Texada, an LTL Specifications Miner
Temporal specifications relate program behavior through time and are useful for software engineering tasks ranging from bug detection to program comprehension. Unfortunately, such specifications are often missing from system descriptions, leading researchers to investigate methods for inferring them from code, execution traces, code comments, and other artifacts. This demo will showcase Texada, a tool to infer temporal specifications from traces of program activity. Texada uses linear temporal logic (LTL) to express temporal properties and can mine LTL properties of arbitrary length and complexity. Texada is an open-source tool.
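To give a flavor of specification mining (a simplified sketch, not Texada's actual algorithm or interface), the following Python code instantiates the classic response template G(x -> F y) over a set of traces and keeps the instantiations that hold on all of them:

```python
from itertools import permutations

def holds_response(trace, x, y):
    """Check G(x -> F y) on one finite trace: every x is later followed by y."""
    for i, e in enumerate(trace):
        if e == x and y not in trace[i + 1:]:
            return False
    return True

def mine_response(traces):
    """Return the (x, y) pairs for which G(x -> F y) holds on every trace."""
    events = {e for t in traces for e in t}
    return [(x, y) for x, y in permutations(events, 2)
            if all(holds_response(t, x, y) for t in traces)]

traces = [["open", "read", "close"],
          ["open", "close", "open", "write", "close"]]
print(mine_response(traces))   # includes ('open', 'close'): G(open -> F close)
```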
1:40-2:00
Ben Livshits (MSR / UW): Just-In-Time Static Analysis
Despite years, if not decades, of research and development on static analysis tools, industrial adoption of much of this tooling remains spotty. Some of this is due to familiar shortcomings of the tooling itself: the effect of false positives on developer satisfaction is well known. However, in this paper, we argue that static-analysis results often run up against cognitive barriers. In other words, the developer is not able to grasp the results easily, leading to higher abandonment rates for analysis tools.
In this work, we propose to improve the current situation with the idea of Just-In-Time (JIT) analyses. In a JIT analysis, results are presented to the user in order of difficulty, starting with easy-to-fix warnings. These warnings are designed to gently "train" the developer and prepare them for reasoning about and fixing more complex bugs. The analysis itself is designed to operate in layers, so that the next layer of results is being computed while the previous one is being examined. The desired effect is that static-analysis results are available just-in-time, with the developer never needing to wait for them to be computed.
Break (2:00-2:30)
Session 2: New Languages (chair: Emina Torlak)
2:30-2:50
Andrew P. Black (PSU): Teaching with Grace in Introductory Programming Classes
Grace is a new, object-oriented educational programming language. Grace is motivated by frustration with available languages, none of which seems to be suited to our target audience: students in the first two programming courses.
What principles should we apply to help us design such a language? We started with a list of 17 "obviously good principles", aware that some of them would conflict with each other. What we didn't expect was that some of them would conflict with good learning.
One of our principles was that the language should provide one "fairly clear way" to do most things. But suppose that an instructor wants to use Grace to compare two ways of doing something. How can one show students the superiority of one approach over another if the alternative approach cannot be expressed? And yet we can hardly fill our language with every misbegotten language feature of the last 50 years just so that we can explain to our students why it is better not to program that way!
Prof. Black will outline the principal features of Grace, list the open issues, and listen to your reactions. The design of Grace is still evolving, so your input can make a difference.
2:50-3:10
Brad Chamberlain (Cray / UW): Chapel: Productive Parallel Programming from the Pacific Northwest
Chapel (http://chapel.cray.com) is an emerging open-source language whose goal is to vastly improve the programmability of parallel systems while also enhancing generality and portability compared to conventional techniques. Chapel is seeing growing levels of interest not only among HPC users, but also in the data analytics, academic, and mainstream communities. Chapel's design and implementation are portable and open-source, supporting a wide variety of compute platforms, from desktops (Mac, *nix) to commodity clusters, the cloud, and large-scale systems developed by Cray and other vendors.
In this talk, I'll start by providing motivation and context for Chapel before giving a brief overview of its unique feature set. I'll also describe the status and organization of the Chapel project itself, highlighting opportunities for collaboration.
3:10-3:30
Vijay Menon (Google Seattle): Dart: Experiences compiling to JavaScript
Dart is a new programming language developed at Google for large-scale client applications. Today, it is used by a number of large applications running on modern web browsers via compilation to JavaScript.
Dart was designed as a dynamic language with an optional but unsound static type system. In this talk, we will discuss our experiences and challenges in compiling to JavaScript, a very different dynamic language. In particular, we'll focus on the challenges of doing this at scale on large applications.
We will also present our current work on "strong" Dart, a sound subset of standard Dart. We will discuss the tradeoffs between soundness and unsoundness for both compiler optimization and the overall developer experience, and describe the techniques used to achieve soundness while maintaining semantic compatibility with standard Dart.
3:30-3:50
Eric Walkingshaw (OSU): Choice Calculus & Variational Programming
In this talk I'll briefly introduce the choice calculus, a formal language that underlies a lot of recent and ongoing work in the PL group at Oregon State on representing, typing, and computing with variation. If time permits, I'll also discuss our current work on writing programs that construct and manipulate variation directly.
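For readers new to the idea, here is a minimal Python sketch of choice-calculus-style variation (simplified notation, not the formal calculus): a named choice D<a, b> is a term, and selection resolves every choice to one alternative per dimension.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Chc:
    """A named choice D<left, right> between two variants."""
    dim: str
    left: Any
    right: Any

def select(term, choices):
    """Resolve choices: `choices` maps dimension name -> 0 (left) or 1 (right)."""
    if isinstance(term, Chc):
        picked = term.left if choices[term.dim] == 0 else term.right
        return select(picked, choices)
    if isinstance(term, list):
        return [select(t, choices) for t in term]
    return term

# One variational program encodes two plain variants:
prog = ["print", Chc("Greeting", "hello", "goodbye"), "world"]
print(select(prog, {"Greeting": 0}))   # ['print', 'hello', 'world']
print(select(prog, {"Greeting": 1}))   # ['print', 'goodbye', 'world']
```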
Break (3:50-4:20)
Session 3: Analysis and Formal Methods (chair: Ben Zorn)
4:20-4:40
Shuvendu Lahiri (MSR): Differential Program Verification: Leveraging and Extending Program Verification for Reasoning about Program Differences
I will outline the opportunities and challenges in leveraging program verification to soundly certify properties of program changes, and its various applications in the software development cycle. I will discuss several projects around differential specification and verification related to assertion safety, termination, and concurrency-safety for two program versions, and conflict-freedom for 3-way merges. I will argue that differential verification naturally deals with the need for specifications, environment modeling, and complex program-specific invariants inherent in program verification. I will show how proofs of such differential properties can leverage advances in program verification, including VC generation, SMT-based checking, and automatic invariant inference. I will also outline the challenges that come with taking program verification in this novel direction. I hope to conclude by convincing the audience that differential verification may be one (perhaps the only) cost-effective way for program verifiers to reach every legacy developer's toolbox!
4:40-5:00
Francesco Logozzo (Facebook): Static Analysis for Security at Facebook scale
Facebook cares enormously about the security and privacy of its enormous code base. In this talk, I will describe Zoncolan, our abstract interpretation-based static analyzer. Zoncolan *precisely* analyzes the codebase in less than 20 minutes. To achieve this, we needed major breakthroughs in the design of abstract domains for security and in the engineering of static analyzers.
5:00-5:20
Kartik Chandra (Henry M. Gunn High School): Verification of Type Systems via Symbolic Execution
A common error in computer programs is to attempt an operation on the wrong data type (such as trying to multiply a number and a string). Such errors can cause the program to crash, and some can be exploited by hackers. So, modern programming languages such as Java come with typecheckers that ensure that programs are well-typed before executing them. However, if the typechecker itself is flawed, subtle type errors can creep through.
Unfortunately, verifying that a typechecker is "correct" (i.e., that it rejects all programs that are not well-typed) is difficult and involves tedious manual proofs.
In this talk, we present early results on automatically checking for the correctness of a typechecker using symbolic execution.
5:20-5:40
Xi Wang (UW): Towards a Formally Verified Storage Stack
Applications depend on persistent storage to recover state after system crashes. But POSIX file system interfaces do not define the possible outcomes of a crash, leading to unexpected application behavior and catastrophic data loss. We proposed crash-consistency models, analogous to memory consistency models, to describe the behavior of a file system across crashes. We developed a formal framework for specifying crash-consistency models, and a toolkit, called Ferrite, for validating those models against real file system implementations. Embedding Ferrite models in Rosette, we prototyped proof-of-concept tools for verification and synthesis of crash-safe applications.
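As a hedged illustration of why crash behavior needs a specification (this example is not from the talk), consider the well-known write-then-rename pattern: without the fsync, POSIX permits a crash to leave the renamed file empty, because the rename may reach disk before the data does.

```python
import os

def save_config_atomically(path, data):
    """Replace `path` with new contents so a crash never exposes a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())     # force data to disk BEFORE the rename
    os.rename(tmp, path)         # atomic replace on POSIX file systems
```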
Reception (starting at 5:40) and Lightning Talks (starting at 6:00, chair: Zach Tatlock)
6:00-6:10
Alvin Cheung (UW): Semantic Software Adaptation Using Verified Lifting
Software developers often need to adapt existing software to new architectures and frameworks. Unfortunately, doing so is a tedious and error-prone process, as developers often need to reverse-engineer the semantics of potentially heavily optimized code and build compilers that translate from the source to the target language. In this talk, I will introduce the concept of verified lifting and discuss how we have used this technique to automatically adapt software in various domains, with examples from databases, stencils, parallel processing frameworks, and programmable hardware.
6:10-6:20
K. Rustan M. Leino (MSR): Modeling program semantics in an automated program verifier
There are benefits to defining programming language semantics formally, for example when proving the soundness of type systems, reasoning about optimizations, or verifying the correctness of compilers. Program semantics is usually cast in terms of predicates defined as least or greatest fixpoints. Is it possible for such developments to benefit from the power of semi-decision procedures? Yes. In this talk, I will show how the Dafny system provides constructs to define and prove theorems about such predicates. Under the hood, the Dafny verifier makes use of a fixpoint-unaware, induction-unaware, first-order SMT solver.
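As a small illustration of the fixpoint style of definition the talk refers to (a Python sketch, not Dafny): reachability in a graph is the least set containing the start node and closed under taking successors, and it can be computed by iterating until nothing changes.

```python
def reachable(start, edges):
    """Compute the least fixpoint: the smallest set containing `start`
    and closed under following edges."""
    reach = {start}
    while True:
        new = {w for v in reach for w in edges.get(v, [])} - reach
        if not new:                 # no change: least fixpoint reached
            return reach
        reach |= new

print(reachable("a", {"a": ["b"], "b": ["c"], "d": ["a"]}))  # {'a', 'b', 'c'}
```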
6:20-6:30
Rishabh Singh (MSR): BlinkFill: Semi-supervised Data Transformation by Examples
Programming By Example (PBE) techniques have shown great promise for enabling end users to perform data transformation tasks using input-output examples. Since examples are often an under-specification, there are typically a large number of hypotheses conforming to the examples, and PBE techniques need complex algorithms to deal with this inherent ambiguity and find the intended program among them. We present a semi-supervised learning technique that significantly reduces this ambiguity by using the logical information present in the input data to guide the synthesis algorithm. We will present a demo of the BlinkFill system, which learns data transformations from examples several orders of magnitude faster than previous systems.
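To illustrate the inherent ambiguity of the PBE setting (a hypothetical example, not BlinkFill's algorithm), note that a single input-output example admits many consistent programs that diverge on new inputs:

```python
# One input-output example the user provides:
example_in, example_out = "John Smith", "J. Smith"

# Two candidate "programs" that both satisfy the example:
prog_a = lambda s: s[0] + "." + s[s.index(" "):]          # first char + '.' + rest
prog_b = lambda s: s.split()[0][0] + ". " + s.split()[1]  # initial + second word

print(prog_a("Jane Doe"))          # 'J. Doe'
print(prog_b("Jane Doe"))          # 'J. Doe'
print(prog_a("Mary Jane Watson"))  # 'M. Jane Watson'
print(prog_b("Mary Jane"))         # 'M. Jane' -- candidates diverge on new inputs
```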
6:30-6:40
Rich LeBlanc (Seattle University): Using the CycleFree Methodology to construct correct concurrent programs
This talk will be a look back at some old work, essentially a product of the 1990s. It will describe a programming methodology for building real-time systems that allows the construction of concurrent programs without the explicit creation and synchronization of threads. The approach requires a program to have an acyclic invocation structure. It depends on an underlying CycleFree Kernel to implicitly schedule units of concurrency and synchronize access to objects. I would like to find collaborators who are interested in exploring how these ideas might fit into the contemporary programming world.
6:40-6:50
Rob DeLine (MSR): Live Programming for Live Data
Data analysis is becoming a core skill on many software teams to support analytics and data-centered applications like wearables and IoT. The Tempe environment provides a live scripting experience for exploring and analyzing live data.
6:50-7:00
Ben Zorn (MSR): Disruptive Technology and the Future of Programming Language Research
The last 5 years have seen disruptive impact in a number of computer science disciplines, including machine learning, data analysis, and human/computer interaction. In the same period, programming language research has maintained a steady pace but it has seen less dramatic impact on the core of computer science or applications built on top of it. Increasingly, the interaction between core computer science and specific verticals, such as health care, manufacturing, or smart cities, has driven large amounts of commercial investment based on the anticipation that the combination will result in major business opportunities and societal impact. In this talk, I argue that programming language researchers should more actively seek collaborations in specific vertical domains to have the greatest impact with their research. To facilitate this shift, as a discipline, we also have to encourage cross-disciplinary efforts and support research that combines multiple disciplines to achieve a greater goal.
7:00
Adjourn