IMPLICIT PROGRAMMING
Implicit programming is a proposed software development paradigm that aims to address longstanding bottlenecks of software construction. The problem that we address is obvious to anyone who has ever written a program: programming is harder than it should be. Although programming activities include some seemingly unavoidable aspects, such as describing the goal of the computation and certain creative aspects of its solution, most of it is tedious and low level. It requires explicitly spelling out and elaborating conceptually simple steps, and ends up requiring more effort than one initially expects. As a result, even easy tasks require the work of professionals and experts, whose programming skills are in stark contrast to the remaining users of computing infrastructure. In the decade with billions of users of computing resources, blurring this contrast can have far-reaching economic and social consequences. Our implicit programming paradigm aims to blur this divide by making software construction substantially easier at several levels, from new declarative programming language constructs to new software development tools. We support implicit programming with a new concept of synthesis procedures, which enhance algorithms in today's compilers by building on the advances in satisfiability modulo theories (SMT) and the rapidly expanding field of software synthesis. We aim to deploy our algorithms through new run-time platforms that perform program analysis and optimization to support and leverage synthesis procedures in the context of the overall application. Finally, we are introducing new development tools, application manipulation interfaces, and techniques to handle ambiguity, making development easier for both experts and non-experts.
Outline of Results
We have developed synthesis and repair techniques and evaluated them in the Leon system (http://leon.epfl.ch). This is a core activity of the project. It is led by Etienne Kneuss, with a number of students contributing. Leon system supports constructing programs from specifications alone, or repairing a program to ensure that it adheres to a given specification. It uses synthesis rules, derived in part from synthesis procedures for integer linear arithmetic (such as those presented in our earlier work on complete functional synthesis) and for algebraic data types, thus extending the domain of synthesis to programs that manipulate data structures. Some of the foundations of this work were laid down in the VMCAI 2013 paper on Reductions for Synthesis Procedures. In addition to the rules derived from synthesis procedures in our main publication in OOPSLA 2013 we introduce rules for recursion schemas and a new symbolic form of syntax-guided synthesis, which enables our system to support synthesis of a large class of functions. Leon's web interface supports interactive application of rules, though we emphasize automated search of rule applications. We were able to automatically synthesize recursive functions such as sorting and data structure operations, without having to bias the system towards a particular application domain. Moreover, Etienne Kneuss also deployed run-time constraint solving into Leon's built-in compiler, which enables execution of complex specifications without the need to synthesize them; a feature that is useful for prototyping and debugging specifications. This feature also works well with some of synthesis rules: it is possible to partly solve a difficult synthesis problem, synthesizing some of the cases while leaving some of the cases for run-time constraint solving (see the RV 2013 invited talk).
A notable example of manipulations of data arises in data bases. In collaboration with Prof. Christoph Koch at EPFL, we have also explored (SIGMOD 2013) transformation-based approach for synthesis, in which we have automatically derived out-of-core algorithms such as database join and external sort. This approach involved a design of domain-specific language of nested list and a cost-estimation approach for programs in this language, as well as a set of transformations that can improve efficiency of data manipulation.
Also contributing to WP1, Eva Darulova has explored synthesis for numerical domains (POPL 2014). Eva developed an approach to synthesize floating point and fixed-point computations from real-valued function specifications. This development required an advance in automatic analysis of roundoff errors in programs, after which one can select the data representation and synthesis code that that meets the provided precision requirements. The analysis advance is based on the use of SMT solvers supporting non-linear arithmetic, as well as a custom domain for keeping track of roundoff errors. We developed a new technique for computing errors across different branches, providing first practical sound guarantees for end-to-end roundoff errors in conditional programs. We also presented a technique for computing errors after a large number of iterations of simple loops using matrix exponentiation. As a result, it is now possible for the first time to write real-valued computation and a precision requirement and, in a number of non-trivial cases, obtain provably correct implementation in floating point or fixed point arithmetic. This bridges a major gap in our ability to construct provably correct code that runs using available efficient numerical data types.
We have also pursued activities that contribute to WP2, putting synthesis into the framework that includes verification, as well as synthesis of program fragments in a larger program context. In the work with Etienne Kneuss and Emmanouil Koukoutos we have identified the notion of program repair as one of the most promising paths for introducing synthesis into larger scale software development projects. A repair request comes in the situation where an error has been identified in a program, which provides an immediate motivation for the developers to apply the technique. Furthermore, the repair scenario helps scalability of synthesis because the original program provides a hint to search for a correct implementation in the vicinity of the original one. Our system, presented at CAV 2013, is able to repair programs that include complex manipulation of syntax trees, including, for example, desugaring phase of a simple compilation pass. Another approach we used to improve scalability is the introduction of symbolic patterns to specify partial behavior functions. Symbolic patterns are natural to use to specify function behavior in certain cases, and can also be used very efficiently to generate many concrete tests. Concrete input/output tests speed up synthesis, reducing the number of invocations of verification steps, without giving up on soundness of solutions. At the same time, we continue to improve the capabilities and robustness of the underlying verification engine, because verification is necessary for successful synthesis of correct solutions. We have incorporated additional SMT solvers into the system, including CVC4, which has particular strengths in reasoning about algebraic data types, quantifiers, and mathematical induction (VMCAI 2013). Emmanouil Koukoutos has also developed an approach that combines static verification with run-time checking. Together, these approaches are a step towards making construction of verified software practical.
In the context of WP3, with Tihomir Gvero, Ivan Kuraj, and Ruzica Piskac, we have pursued code fragment synthesis based on statistics derived from text corpus. This solution, published in PLDI 2013, is the first one to establish a principled connection between type inhabitation approach to synthesis that comes from constructive logic and statistical analysis of corpus of code. A number of efficiency optimizations make this approach well performing even in the context of thousands of library declarations that describe building blocks of expressions. What is crucial is that this approach computes frequencies of uses of functions from libraries and generates expressions ranked according to the likelihood of declarations in them. Our evaluation has found that in a number of cases our approach synthesizes the expected expressions for a given program context.
Tihomir Gvero has recently pushed this approach further by computing a more detailed probabilistic context-free grammar model of Java expressions learned from a large corpus of Java programs and then using it to generate expressions that correspond to an arbitrary free form query given by the user. The free form text may contain program fragments as well as English text segments. Our tool uses a modified natural language parser to extract information from the input text and then combines it with the probabilistic context-free grammar model to construct Java expressions. The result is a unique system that bridges the gap between natural language and meaningful Java expressions, and brings development tools to a new level.
Runtime application customization has been pursued by Mikael Mayer. Mikael has developed an approach for programming by demonstration for 2D games on a touch-enabled tablet or phone (Onward! 2013). The system is available as an application named “Pong Designer” from Google Play store for Android OS devices. The system enables a user to run and interact with a 2D game on a touch-enabled tablet, then pause the execution, rewind to a past point in the execution and examine the events that took place. Subsequently, the user can demonstrate that certain input actions should have resulted in the desired outcome. The system accepts a number of such demonstrations and synthesizes from them rule for the game engine. Using this approach it is possible to program a number of 2D games, specifying not only the graphical elements but also their behavior.
Selected Outputs
Leon tool:
Videos:
- anyCode tool for suggestion code fragments, by Tihomir Gvero
- Synthesis using Leon, by Etienne Kneuss
- Viktor's Inaugural Lecture at EPFL (uses flash)
The following are some of the related papers that explore these techniques and tools and describe our results so far:
- Software Synthesis Procedures (CACM 2012)
- Constraints as Control (POPL 2012)
- Synthesis for Unbounded Bitvector Arithmetic (IJCAR 2012)
- Complete Functional Synthesis (PLDI 2010)
- Test Generation through Programming in UDITA (ICSE 2010)
General talks about these topics were given at several keynotes, lectures, and summaries. Here are a few versions of these talks:
- BoogiePL keynote: August 2011, Wroclaw, PL
- SMT keynote: July 2011, Snowbird, USA. Here is the abstract
- Verimag seminar talk: May 2011, Grenoble, FR
- PUMA talk: February 2011, Munich, DE
The starting point for implicit programming is our recent work on complete functional synthesis and regular synthesis for unbounded domains, interactive synthesis using resolution-based techniques, work on test generation, as well as our work on decision procedures.
Main tools supporting implicit programming are:
- Leon, for synthesis and repair of functional code: http://leon.epfl.ch (see the provided detailed documentation)
- Rosa, for synthesis of sound numerical computation: http://lara.epfl.ch/w/rosa
- InSynth, for Interactive Synthesis
- anyCode, for free-form queries, see https://www.youtube.com/watch?v=tx4-XgAZkKU
- Pong Designer - Programming Games by Demonstration
Earlier tools that explore aspects of implicit programming are:
- Comfusy - Complete Functional Synthesis
- Kaplan - Programming with First-class Constraints
- RegSy - Regular Synthesis over Unbounded Domains
Funding acknowledgement: This project is supported by the European Research Council (ERC) Starting Grant awarded to Viktor Kuncak, who leads the Lab for Automated Reasoning and Analysis at EPFL.