CS294-2: Software Synthesis
Spring 2006
Time and Place: Tue and Thu 9:30-11:00, 310 Soda
Units: 3
Recommended background: CS 164 or equivalent
Instructor: Rastislav Bodik, 773 Soda
Office hours: Tue 11-12, Thu 3-4
Course overview: Programmers would love to have their code synthesized from a concise specification, and computer science has been trying to satisfy their wishes for three decades. While exciting theoretical results exist and successful tools for specific domains have emerged, synthesis has not yet entered the mainstream. The purpose of the course is to understand the reasons and to synthesize a research direction for software synthesis that suits today's programmers and processes. We'll base our discussions on lessons from old classics, on successes and failures of past synthesis tools, and on promises of recent technologies, some achieved here at Berkeley. Our view of code synthesis will be broad, spanning from deductive synthesis to genetic programming (see the topics below). Similarly, we'll cover a range of applications, from high-performance computing, object-oriented programming, API programming, and assembly-level programming to agent programming.
Intended audience: Graduate students in EECS. Seniors with interest in programming systems are encouraged to enroll. The course will cover diverse code synthesis technology and applications, and hence may be of interest not only to students in programming languages, but also in scientific computing, graphics, CAD, and other areas.
Student workload: Reading assigned papers and participating in class discussions. Presenting one paper in class (undergraduates may choose to present a demo of a tool). Project (literature review, novel algorithm design, or implementation).
Lecture format: Each lecture will discuss a paper, with active participation from students. Project presentations. Some lectures will be presented by guest speakers.
Initial list of topics and papers (under construction). To be refined according to student interests:
- Deductive software synthesis: the proof is the program
- Transformational synthesis
- Program differentiation
- Superoptimizers: a search for the best assembly code sequence
- Programming by demonstration, scenarios, examples:
- Synthesis with partial programs:
- Scientific computing
- Schema-based synthesis:
- Object-oriented programming, components:
- Genetic algorithms:
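To make the superoptimizer topic in the list above concrete, here is a minimal sketch of the idea: enumerate instruction sequences in order of increasing length and return the first one that matches a target function on every input. The 8-bit single-register machine, the instruction names, and the functions `run` and `superoptimize` are all hypothetical, chosen only for illustration; real superoptimizers work on actual instruction sets and use far more aggressive pruning.

```python
from itertools import product

MASK = 0xFF  # hypothetical 8-bit, single-register machine

# Hypothetical instruction set: each instruction maps the register value to a new value.
INSTRUCTIONS = {
    "inc": lambda x: (x + 1) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "shl": lambda x: (x << 1) & MASK,
    "neg": lambda x: (-x) & MASK,
    "not": lambda x: (~x) & MASK,
}


def run(program, x):
    """Execute a sequence of instruction names on the initial register value x."""
    for op in program:
        x = INSTRUCTIONS[op](x)
    return x


def superoptimize(target, max_len=4, inputs=range(256)):
    """Return the shortest program equal to target on all inputs, or None.

    Programs are tried in order of increasing length, so the first match
    is a shortest one. Checking all 256 register values makes the
    equivalence test exhaustive for this toy machine.
    """
    for length in range(max_len + 1):
        for program in product(INSTRUCTIONS, repeat=length):
            if all(run(program, x) == (target(x) & MASK) for x in inputs):
                return list(program)
    return None


if __name__ == "__main__":
    # Search for the shortest sequence computing 2*x + 2 (mod 256).
    print(superoptimize(lambda x: 2 * x + 2))
```

The search space grows exponentially with program length, which is why this brute-force approach is only practical for very short sequences.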
Papers suggested by students:
- Genetic algorithms for solving heuristic compiler optimization problems: paper on using machine learning to find a good inlining heuristic (more, more) --Manu
- Automated tuning. Self Adapting Linear Algebra Algorithms and Software: This paper describes how to create linear algebra
software that examines properties of the machine during installation
and, based on what it observes, chooses which algorithms to use.
The paper is long, so it might be best to focus on just one or
two of its major sections. These sections cover
1) Creating dense numerical linear algebra libraries by examining machine
properties at install time and selecting the algorithms to install
based on the observed properties.
2) The more complicated process of doing something similar for sparse
numerical linear algebra libraries.
3) Using statistical learning techniques to pick the algorithms for linear
algebra libraries.
This fits under the synthesis umbrella because, while the space of
available algorithms might be known (and thus no new code is
"synthesized"), the algorithms that perform best vary from
platform to platform. This means that on
each machine, the final library is in effect synthesized
from a "grab-bag" of available algorithms during the installation process. --Hormozd
- Inductive logic programming: given
a bunch of example sentences in some logical language such as Prolog,
find a logic program that 'explains' those sentences; relevant, e.g. to programming by demonstration, paper ---Bhaskara
- Web service synthesis: Automated Synthesis of Executable Web Service Compositions from BPEL4WS Processes.
From the Service Oriented Architecture point of view, any modular web service can be [potentially produced with] automated synthesis. This paper discusses one example of such a process.
Semantic Web Service Composition via Logic-based Program Synthesis
This paper goes into more depth on what defines a web service, what challenges come with the growing number of web services, and one solution: web service composition via logic-based program synthesis. --Cindy
- Synthesis for numerical kernels and DSP algorithms. The SPIRAL project at CMU, overview paper --Amir
- Automatic parallelization: parallelization is a form of software
synthesis: the programmer supplies a sequential program with annotations
on data layout and synchronization requirements, and the compiler
produces a parallel SPMD program with the appropriate data partitioning
and communication calls for the compilation target. Compared to other
parallel programming methodologies, data parallel languages have a
significant advantage in productivity, since programmers are freed from
the tedious and often error-prone task of writing communication
code. Unfortunately, since the overhead of a remote access is
typically orders of magnitude higher than a local access on today’s
clusters, a naive parallelization strategy would result in a large
number of small messages and unacceptable performance. Thus, compiler
writers for data parallel languages invested significant effort in
communication optimizations such as vectorization and pipelining. With
the advent of multi-core architectures, there has also been renewed
interest in data parallel paradigms such as HPF and OpenMP; the
hardware is already available, so you might as well let compilers try to
take advantage of it automatically, and the cost of communication is
generally much lower for such architectures, which simplifies the task
of parallelization. So, I think a paper on data parallel languages and
their parallelization techniques may be interesting for the class, even
if it’s a bit dated. Compiling Fortran D for MIMD Distributed-Memory Machines --Wei
- Meta programming: I have chosen the following paper that I believe is relevant to software
synthesis:
Sheard, T. and Jones, S. P. 2002. Template meta-programming for Haskell.
In Proceedings of the 2002 ACM SIGPLAN Workshop on Haskell (Pittsburgh,
Pennsylvania). 1-16. (more)
Meta-programming refers to techniques that allow programmers to
manipulate program code as data, so you can, for example, do some computation at compile time to produce the program to be executed. This
technique might be used to specialize code for efficiency reasons (as in
partial evaluation) or to overcome expressibility issues in
statically-typed languages. I view this technique as providing the
ability to define high-level DSLs (embedded in a general purpose "host"
language) and describe how to (efficiently) synthesize/compile that DSL. I chose this paper over others in "Meta-programming" because it is fairly
recent with a reasonably detailed related work section comparing with the
long-standing quotation system in Lisp/Scheme and even some discussion on
C++ templates. --Evan
- Synthesis from temporal logic specifications. paper 1 This seems like a fundamental piece on successful synthesis of
reactive systems based on temporal logic specifications; it also
seems to discuss algorithmic bounds on the synthesis problem for
such scenarios. (I found it pretty interesting following Amir
Pnueli's talk at VMCAI, in which he hinted at the question of
synthesis -- using TL specs, of course -- versus design (UML,
etc.).) This is a more recent work, still concerning synthesis from TL
specs, yet incorporating more interesting properties such as
faultiness and tolerance. --Gilad
- Synthesis for the development of code for embedded systems:
1. R.K. Gupta and G. De Micheli, "Hardware-software Co-synthesis for Digital Systems": Introduces the use of "co-synthesis" for systems for which one wishes to synthesize code for hardware-software integrated platforms.
2. S. Parameswaran, "HW-SW Co-Synthesis: The Present and The Future": Gives an overview of co-synthesis for large hardware/software systems.
3. S. Srinivasan and N. K. Jha, "Hardware-Software Co-Synthesis of Fault-Tolerant Real-Time Distributed Embedded Systems": Addresses the problem of automatic hardware-software co-synthesis of fault tolerant, distributed systems.
4. A. Ledeczi, et al., "The Generic Modeling Environment": A tool that supports creation of domain-specific modeling and program synthesis environments based on "metamodeling" concepts. Tool --Mark
- Embedded software. Here is another paper:
"Synthesis of software programs for embedded control applications". This work was part of the POLIS project (done by the CAD group here). Basically, they used S-graphs to model the specification of control applications, and used BDDs (binary decision diagrams) to optimize the S-graphs. Software code was then generated from the optimized S-graphs. So I think it is synthesis of embedded software that takes advantage of restricted specifications (a particular type of control application).
Another interesting work from the CAD group here.
This was done by Prof. Lee's group. They did software synthesis for applications that can be modeled as dataflow graphs. Lots of multimedia applications fit into this category. -- Qi
- Synthesis and simulation of digital systems containing interacting
hardware and software components. Paper. This paper broadly covers synthesis for both hardware and software
systems. PL students will gain knowledge of hardware systems, and EE students will
learn about software code synthesis. It also talks about simulation, which usually means executing a
prototype and is concerned more with correctness than performance. --Thomas
- Tool adoption. DSP software synthesis. A very important problem is why a real user
would or would not choose to adopt a new tool. It seems like a tool that
complements a user's existing workflow has a much better chance of
becoming mainstream than one that drastically changes it. One example
of this is in DSP development. DSP developers already have the habit
of using Matlab to simulate their designs before manually coding the
implementation in C. Synthesis tools that can take Matlab or Simulink
and generate C code automatically have started to gain traction in the
DSP community. --Jimmy
- DSLs: JTS: Tools for Implementing Domain-Specific Languages (http://citeseer.ist.psu.edu/171171.html) It describes an extensible superset of Java called Jak and a compiler generator named Bali. Together they can process DSLs specified by the user (i.e. can be used to compile Java code that has been extended with arbitrary augmentations). --Liviu
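The install-time selection idea described under "Automated tuning" in the list above can be illustrated with a small sketch: keep several interchangeable implementations of one routine, measure each on the installing machine, and keep the fastest. This is not the method of the suggested paper; the two matrix-multiply variants, the `wall_clock_cost` timer, and the `tune` function are hypothetical names made up for this example.

```python
import time


def matmul_ijk(a, b):
    """Textbook loop order: each output element is one inner product."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def matmul_ikj(a, b):
    """Alternative loop order that walks rows of b and c contiguously."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        row_c = c[i]
        for k in range(m):
            aik = a[i][k]
            row_b = b[k]
            for j in range(p):
                row_c[j] += aik * row_b[j]
    return c


# The "grab-bag" of available algorithms (hypothetical).
CANDIDATES = {"ijk": matmul_ijk, "ikj": matmul_ikj}


def wall_clock_cost(fn, a, b, repeats=3):
    """Best-of-N wall-clock time of fn(a, b) on this machine."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(a, b)
        best = min(best, time.perf_counter() - start)
    return best


def tune(candidates, a, b, cost=wall_clock_cost):
    """Return the name of the candidate with the lowest measured cost."""
    return min(candidates, key=lambda name: cost(candidates[name], a, b))


if __name__ == "__main__":
    size = 40
    a = [[i + j for j in range(size)] for i in range(size)]
    b = [[i - j for j in range(size)] for i in range(size)]
    print("selected variant:", tune(CANDIDATES, a, b))
```

Passing the cost function as a parameter keeps the selection logic separate from the measurement, so a learned performance model could be substituted for direct timing.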
Links to further candidate papers: