
Viktor Kuncak: Publications and manuscripts

For details on each manuscript, click on the manuscript [number]. For the PDF, click on the manuscript title.
[160] Proactive Synthesis of Recursive Tree-to-String Functions from Examples. 31st European Conference on Object-Oriented Programming (ECOOP 2017). 2017. [ bib | DOI | http | list ]
[159] Towards a Compiler for Reals. ACM Trans. Program. Lang. Syst. (TOPLAS). 2017. [ bib | DOI | list ]
[158] Contract-Based Resource Verification for Higher-order Functions with Memoization. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2017. [ bib | list ]
[157] SMT-Based Checking of Predicate-Qualified Types for Scala. Scala Symposium. 2016. [ bib | list ]
[156] An Update on Deductive Synthesis and Repair in the Leon Tool. 5th Workshop on Synthesis. 2016. [ bib | list ]
[155] Translating Scala Programs to Isabelle/HOL (System Description). International Joint Conference on Automated Reasoning (IJCAR). 2016. [ ps | bib | list ]
[154] Programming with Enumerable Sets of Structures. OOPSLA. 2015. [ bib | list ]
[153] Synthesizing Java Expressions from Free-Form Queries. OOPSLA. 2015. [ bib | list ]
[152] Automating Grammar Comparison. OOPSLA. 2015. [ bib | list ]
[151] Deductive Program Repair. Computer-Aided Verification (CAV). 2015. [ bib | list ]
[150] Counterexample Guided Quantifier Instantiation for Synthesis in SMT. Computer-Aided Verification (CAV). 2015. [ bib | list ]
[149] Counter-Example Complete Verification for Higher-Order Functions. Scala Symposium. 2015. [ bib | list ]
[148] Sound Reasoning about Integral Data Types with a Reusable SMT Solver Interface. Scala Symposium. 2015. [ bib | list ]
[147] Interactive Synthesis Using Free-Form Queries (Tool Demonstration). International Conference on Software Engineering (ICSE). 2015. [ bib | list ]
[146] Developing Verified Software Using Leon (Invited Contribution). NASA Formal Methods (NFM). 2015. [ bib | list ]
[145] Induction for SMT Solvers. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2015. [ bib | list ]
[144] On Induction for SMT Solvers. EPFL Technical Report EPFL-REPORT-201755. 2014. [ bib | list ]
[143] On Synthesizing Code from Free-Form Queries. EPFL Technical Report EPFL-REPORT-201606. 2014. [ bib | list ]
[142] Synthesizing Functions from Relations in Leon (Invited Contribution). Logic-Based Program Synthesis and Transformation (LOPSTR). 2014. [ bib | list ]
[141] SciFe: Scala Framework for Efficient Enumeration of Data Structures with Invariants. Scala Workshop. 2014. [ bib | list ]
[140] Checking Data Structure Properties Orders of Magnitude Faster. Runtime Verification (RV). 2014. [ bib | list ]
[139] Verifying and Synthesizing Software with Recursive Functions (Invited Contribution). 41st International Colloquium on Automata, Languages, and Programming (ICALP). 2014. [ ps | bib | list ]
[138] Symbolic Resource Bound Inference for Functional Programs. Computer Aided Verification (CAV). 2014. [ ps | bib | list ]
[137] On Template-Based Inference of Rich Invariants in Leon. EPFL Technical Report EPFL-REPORT-190578. 2013. [ bib | list ]
[136] Sound Compilation of Reals. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2014. [ bib | list ]
[135] Synthesis Modulo Recursive Functions. OOPSLA. 2013. [ ps | bib | list ]
[134] Game Programming by Demonstration. SPLASH Onward!. 2013. [ bib | list ]
[133] Interpolation for Synthesis on Unbounded Domains. Formal Methods in Computer-Aided Design (FMCAD). 2013. [ bib | list ]
[132] Synthesis of Fixed-Point Programs. Embedded Software (EMSOFT). 2013. [ ps | bib | list ]
[131] Effect Analysis for Programs with Callbacks. Fifth Working Conference on Verified Software: Theories, Tools and Experiments. 2013. [ bib | list ]
[130] Classifying and Solving Horn Clauses for Verification. Fifth Working Conference on Verified Software: Theories, Tools and Experiments. 2013. [ bib | list ]
[129] Executing Specifications using Synthesis and Constraint Solving (Invited Talk). Runtime Verification (RV). 2013. [ ps | bib | list ]
[128] An Overview of the Leon Verification System: Verification by Translation to Recursive Functions. Scala Workshop. 2013. [ ps | bib | list ]
[127] Automatic Synthesis of Out-of-Core Algorithms. SIGMOD. 2013. [ ps | bib | list ]
[126] On Verification by Translation to Recursive Functions. EPFL Technical Report EPFL-REPORT-186233. 2013. [ bib | list ]
[125] On Integrating Deductive Synthesis and Verification Systems. EPFL Technical Report EPFL-REPORT-186043. 2013. [ bib | list ]
[124] Disjunctive Interpolants for Horn-Clause Verification. Computer Aided Verification (CAV). 2013. [ bib | list ]
[123] Complete Completion using Types and Weights. PLDI. 2013. [ bib | list ]
[122] Software Verification and Graph Similarity for Automated Evaluation of Students' Assignments. Information and Software Technology. 2013. [ bib | list ]
[121] Reductions for Synthesis Procedures. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2013. [ bib | list ]
A synthesis procedure acts as a compiler for declarative specifications. It accepts a formula describing a relation between inputs and outputs, and generates a function implementing this relation. This paper presents synthesis procedures for data structures. Our procedures are reductions that lift a synthesis procedure for the elements into synthesis procedures for containers storing these elements. We introduce a framework to describe synthesis procedures as systematic applications of inference rules. We show that, by interpreting both synthesis problems and programs as relations, we can derive and modularly prove transformation rules that are widely applicable, thus simplifying both the presentation and the correctness argument.

[120] Certifying Solutions for Numerical Constraints. Runtime Verification (RV). 2012. [ bib | list ]
A large portion of software is used for numerical computation in mathematics, physics and engineering. Among the aspects that make verification in this domain difficult is the need to quantify numerical errors, such as roundoff errors and errors due to the use of approximate numerical methods. Much of numerical software uses self-stabilizing iterative algorithms, for example, to find solutions of nonlinear equations. To support such algorithms, we present a runtime verification technique that checks, given a nonlinear equation and a tentative solution, whether this value is indeed a solution to within a specified precision. Our technique combines runtime verification approaches with information about the analytical equation being solved. It is independent of the algorithm used for finding the solution and is therefore applicable to a wide range of problems. We have implemented our technique for the Scala programming language using our affine arithmetic library and the macro facility of Scala 2.10.
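
A minimal sketch of the certification idea, assuming a continuous f (this is not the paper's affine-arithmetic library, which also rigorously bounds the floating-point error of evaluating f itself): a sign change of f across [x - eps, x + eps] certifies that a true root lies within eps of the tentative solution x, independently of which algorithm produced x.

    // Certify a tentative root of a continuous function via a sign change.
    // Toy version: the rounding error of evaluating f is ignored here.
    def certifyRoot(f: Double => Double, x: Double, eps: Double): Boolean = {
      val lo = f(x - eps)
      val hi = f(x + eps)
      lo * hi <= 0.0 // opposite signs: [x - eps, x + eps] brackets a root
    }

    // Example: a Newton iteration's output for sqrt(2), as a root of x^2 - 2.
    certifyRoot(x => x * x - 2.0, 1.41421356, 1e-6) // true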

[119] Accelerating Interpolants. Automated Technology for Verification and Analysis (ATVA). 2012. [ ps | bib | list ]
We present Counterexample-Guided Accelerated Abstraction Refinement (CEGAAR), a new algorithm for verifying infinite-state transition systems. CEGAAR combines interpolation-based predicate discovery in counterexample-guided predicate abstraction with an acceleration technique for computing the transitive closure of loops. CEGAAR applies acceleration to dynamically discovered looping patterns in the unfolding of the transition system, and combines overapproximation with underapproximation. It constructs inductive invariants that rule out an infinite family of spurious counterexamples, alleviating the problem of divergence in predicate abstraction without losing its adaptive nature. We present theoretical and experimental justification for the effectiveness of CEGAAR, showing that inductive interpolants can be computed from classical Craig interpolants and transitive closures of loops. We present an implementation of CEGAAR that verifies integer transition systems. We show that the resulting implementation robustly handles a number of difficult transition systems that cannot be handled using interpolation-based predicate abstraction or acceleration alone.

[118] A Verification Toolkit for Numerical Transition Systems (Tool Paper). 16th International Symposium on Formal Methods (FM). 2012. [ bib | list ]
This paper reports on an effort to create benchmarks and a toolkit for rigorous verification problems, simplifying tool integration and eliminating ambiguities of complex programming language constructs. We focus on Integer Numerical Transition Systems (INTS), which can be viewed as control-flow graphs whose edges are annotated by Presburger arithmetic formulas. We describe the syntax, semantics, a front-end, and a first release of benchmarks for such transition systems. Furthermore, we present Flata and Eldarica, two new verification tools for INTS. The Flata system is based on precise acceleration of the transition relation, while the Eldarica system is based on predicate abstraction with interpolation-based counterexample-driven refinement. The Eldarica verifier uses the Princess theorem prover as a sound and complete interpolating prover for Presburger arithmetic. Both systems can solve several examples for which previous approaches failed and present a useful baseline for verifying integer programs. Our infrastructure is publicly available; we hope that it will spur further research, benchmarking, competitions, and synergistic communication between verification tools.

[117] Synthesis for Unbounded Bitvector Arithmetic. International Joint Conference on Automated Reasoning (IJCAR). 2012. [ bib | list ]
We propose to describe computations using QFPAbit, a language of quantifier-free linear arithmetic on unbounded integers with bitvector operations. We describe an algorithm that, given a QFPAbit formula with input and output variables denoting integers, generates an efficient function from a sequence of inputs to a sequence of outputs, whenever such a function on integers exists. The starting point for our method is a polynomial-time translation mapping a QFPAbit formula into the sequential circuit that checks the correctness of the input/output relation. From such a circuit, our synthesis algorithm produces solved circuits from inputs to outputs that are no more than singly exponential in the size of the original formula. In addition to the general synthesis algorithm, we present techniques that ensure that, for example, multiplication and division with large constants do not lead to an exponential blowup, addressing a practical problem with a previous approach that used the MONA tool to generate the specification automata.
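
As one illustration (a hypothetical instance, not an example from the paper) of the relations QFPAbit mixes: linear arithmetic over unbounded integers combined with bitwise operations, for which the synthesized code should read the input bits once and directly emit the output.

    // y is x with its lowest set bit cleared: expressible as the QFPAbit-style
    // relation y == (x & (x - 1)), mixing arithmetic (-) with bitwise (&).
    def clearLowestBit(x: BigInt): BigInt = x & (x - 1)

    clearLowestBit(BigInt(12)) // 12 = 0b1100  ->  8 = 0b1000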

[116] Speculative Linearizability. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI). 2012. [ ps | bib | list ]
Linearizability is a key design methodology for reasoning about implementations of concurrent abstract data types in both shared memory and message passing systems. It provides the illusion that operations execute sequentially and fault-free, despite the asynchrony and faults inherent to a concurrent system, especially a distributed one. A key property of linearizability is inter-object composability: a system composed of linearizable objects is itself linearizable. However, devising linearizable objects is very difficult, requiring complex algorithms to work correctly under general circumstances, and often resulting in bad average-case behavior. Concurrent algorithm designers therefore resort to speculation: optimizing algorithms to handle common scenarios more efficiently. The outcome is even more complex protocols, for which proving correctness is no longer tractable. To simplify the design of efficient yet robust linearizable protocols, we propose a new notion: speculative linearizability. This property is as general as linearizability, yet it allows intra-object composability: the correctness of independent protocol phases implies the correctness of their composition. In particular, it allows the designer to focus solely on the proof of an optimization and derive the correctness of the overall protocol from the correctness of the existing, non-optimized one. Our notion of protocol phases allows processes to independently switch from one phase to another, without requiring them to reach agreement to determine the change of a phase. To illustrate the applicability of our methodology, we show how examples of speculative algorithms for shared memory and asynchronous message passing naturally fit into our framework. We rigorously define speculative linearizability and prove our intra-object composition theorem in a trace-based as well as an automaton-based model. To obtain a further degree of confidence, we also formalize and mechanically check the theorem in the automaton-based model, using the I/O automata framework within the Isabelle interactive proof assistant.

[115] Software Synthesis Procedures. Communications of the ACM. 2012. [ ps | bib | list ]
Automated synthesis of program fragments from specifications can make programs easier to write and easier to reason about. To integrate synthesis into programming languages, software synthesis algorithms should behave in a predictable way: they should succeed for a well-defined class of specifications. We propose to systematically generalize decision procedures into synthesis procedures, and use them to compile implicitly specified computations embedded inside functional and imperative programs. Synthesis procedures are predictable, because they are guaranteed to find code that satisfies the specification whenever such code exists. To illustrate our method, we derive synthesis procedures by extending quantifier elimination algorithms for integer arithmetic and set data structures. We then show that an implementation of such synthesis procedures can extend a compiler to support implicit value definitions and advanced pattern matching.

[114] Functional Synthesis for Linear Arithmetic and Sets. Software Tools for Technology Transfer (STTT). 2012. [ bib | list ]
Synthesis of program fragments from specifications can make programs easier to write and easier to reason about. To integrate synthesis into programming languages, synthesis algorithms should behave in a predictable way: they should succeed for a well-defined class of specifications. To guarantee correctness and applicability to software (and not just hardware), these algorithms should also support unbounded data types, such as numbers and data structures. To obtain appropriate synthesis algorithms, we propose to generalize decision procedures into predictable and complete synthesis procedures. Such procedures are guaranteed to find code that satisfies the specification if such code exists. Moreover, we identify conditions under which synthesis will statically decide whether the solution is guaranteed to exist, and whether it is unique. We demonstrate our approach by starting from a quantifier elimination decision procedure for Boolean Algebra of sets with Presburger Arithmetic (BAPA) and transforming it into a synthesis procedure. Our procedure also works in the presence of parametric coefficients. We establish results on the size and the efficiency of the synthesized code. We show that such procedures are useful as a language extension with implicit value definitions, and we show how to extend a compiler to support such definitions. Our constructs provide the benefits of synthesis to programmers, without requiring them to learn new concepts, give up a deterministic execution model, or provide code skeletons.
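
To picture the language extension, here is a plain-Scala stand-in for a choose-style implicit value definition (the `choose` below is hypothetical and searches explicitly; a synthesis procedure instead compiles the specification into the direct computation shown in the final comment).

    // Explicit-search stand-in for an implicit value definition over linear
    // integer arithmetic; a synthesis procedure would emit the closed form.
    def choose(spec: ((Int, Int, Int)) => Boolean): (Int, Int, Int) =
      (for { h <- 0 to 23; m <- 0 to 59; s <- 0 to 59 } yield (h, m, s))
        .find(spec).get

    val total = 4000 // seconds
    val (h, m, s) = choose { case (h, m, s) =>
      h * 3600 + m * 60 + s == total
    }
    // synthesized code: h = total / 3600; m = total % 3600 / 60; s = total % 60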

[113] Deciding Functional Lists with Sublist Sets. Verified Software: Theories, Tools and Experiments (VSTTE). 2012. [ ps | bib | list ]
Motivated by the problem of deciding verification conditions for the verification of functional programs, we present new decision procedures for automated reasoning about functional lists. We first show how to decide in NP the satisfiability problem for logical constraints containing equality, constructors, selectors, as well as the transitive sublist relation. We then extend this class of constraints with operators to compute the set of all sublists, and the set of objects stored in a list. Finally, we support constraints on sizes of sets, which gives us the ability to compute list length as well as the number of distinct list elements. We show that the extended theory is reducible to the theory of sets with linear cardinality constraints, and therefore still in NP. This reduction enables us to combine our theory with other decidable theories that impose constraints on sets of objects, which further increases the potential of our decidability result in verification of functional and imperative software.

[112] Development and Evaluation of LAV: an SMT-Based Error Finding Platform. Verified Software: Theories, Tools and Experiments (VSTTE). 2012. [ ps | bib | list ]
We present the design and evaluation of LAV, a new open-source tool for statically checking program assertions and errors. LAV integrates into the popular LLVM infrastructure for compilation and analysis. LAV uses symbolic execution to construct a first-order logic formula that models the behavior of each basic block. It models the relationships between basic blocks using propositional formulas. By combining these two kinds of formulas LAV generates polynomial-sized verification conditions for loop-free code. It uses underapproximating or overapproximating unrolling to handle loops. LAV can pass generated verification conditions to one of several SMT solvers: Boolector, MathSAT, Yices, and Z3. Our experiments with 200 small benchmarks suggest that LAV is competitive with related tools, so it can be used as an effective alternative for certain verification tasks. The experience also shows that LAV provides significant help in analyzing student programs and providing feedback to students in everyday university practice.

[111] Constraints as Control. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2012. [ ps | bib | list ]
We present an extension of Scala that supports constraint programming over bounded and unbounded domains. The resulting language, Kaplan, provides the benefits of constraint programming while preserving the existing features of Scala. Kaplan integrates constraint and imperative programming by using constraints as an advanced control structure; the developers use the monadic 'for' construct to iterate over the solutions of constraints or branch on the existence of a solution. The constructs we introduce have simple semantics that can be understood as explicit enumeration of values, but are implemented more efficiently using symbolic reasoning. Kaplan programs can manipulate constraints at run-time, with the combined benefits of type-safe syntax trees and first-class functions. The language of constraints is a functional subset of Scala, supporting arbitrary recursive function definitions over algebraic data types, sets, maps, and integers. Our implementation runs on a platform combining a constraint solver with a standard virtual machine. For constraint solving we use an algorithm that handles recursive function definitions through fair function unrolling and builds upon the state-of-the-art SMT solver Z3. We evaluate Kaplan on examples ranging from enumeration of data structures to execution of declarative specifications. We found Kaplan promising because it is expressive, supporting a range of problem domains, while enabling full-speed execution of programs that do not rely on constraint programming.
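
The abstract's explicit-enumeration semantics can be rendered directly in plain Scala; the sketch below is hypothetical, with brute-force search standing in for Kaplan's symbolic solving via Z3, and shows a monadic 'for' iterating over the solutions of a constraint.

    // Enumerate solutions of a constraint with a for-comprehension; Kaplan
    // obtains the same solution sequence symbolically, not by enumeration.
    def solutions(n: Int): LazyList[(Int, Int)] =
      for {
        x <- LazyList.range(0, n + 1)
        y = n - x
        if x <= y // constraint: ordered nonnegative pairs summing to n
      } yield (x, y)

    for ((x, y) <- solutions(10).take(3)) println(s"$x + $y = 10")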

[110] Trustworthy Numerical Computation in Scala. OOPSLA. 2011. [ ps | bib | list ]
Modern computing has adopted the floating point type as a default way to describe computations with real numbers. Thanks to dedicated hardware support, such computations are efficient on modern architectures, even in double precision. However, rigorous reasoning about the resulting programs remains difficult. This is in part due to a large gap between the finite floating point representation and the infinite-precision real-number semantics that serves as the developers' mental model. Because programming languages do not provide support for estimating errors, some computations in practice are performed more and some less precisely than needed. We present a library solution for rigorous arithmetic computation. Our numerical data type library tracks a (double) floating point value, but also a guaranteed upper bound on the error between this value and the ideal value that would be computed in the real-value semantics. Our implementation involves a set of linear approximations based on an extension of affine arithmetic. The derived approximations cover most of the standard mathematical operations, including trigonometric functions, and are more comprehensive than any publicly available ones. Moreover, while interval arithmetic rapidly yields overly pessimistic estimates, our approach remains precise for several computational tasks of interest. We evaluate the library on a number of examples from numerical analysis and physical simulations. We found it to be a useful tool for gaining confidence in the correctness of the computation.
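
A minimal sketch of the data type's contract (a toy version, not the library: real affine arithmetic tracks correlated error terms and rounds the error computation itself outward): carry a double together with a guaranteed bound on its distance from the ideal real-valued result.

    // A double plus an upper bound on its deviation from the ideal real value.
    final case class Approx(value: Double, err: Double) {
      def +(that: Approx): Approx =
        Approx(value + that.value,
               err + that.err + math.ulp(value + that.value)) // + rounding
      def *(that: Approx): Approx =
        Approx(value * that.value,
               math.abs(value) * that.err + math.abs(that.value) * err +
                 err * that.err + math.ulp(value * that.value)) // + rounding
    }

    val x = Approx(0.1, math.ulp(0.1)) // 0.1 is not exact in binary
    val y = x + x + x                  // ~0.3 with a guaranteed error bound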

[109] Satisfiability Modulo Recursive Programs. Static Analysis Symposium (SAS). 2011. [ ps | bib | list ]
We present a semi-decision procedure for checking expressive correctness properties of recursive first-order functional programs. In our approach, both properties and programs are expressed in the same language, which is a subset of Scala. We have implemented our procedure and integrated it with the Z3 SMT solver and the Scala compiler. Our procedure is sound for proofs and counterexamples. It is terminating and thus complete for many important classes of specifications, including 1) all verification problems containing counterexamples, 2) functions annotated with inductive postconditions, and 3) abstraction functions and invariants of recursive data types that commonly arise in functional programs. Using our system, we verified detailed correctness properties for functional data structure implementations, as well as Scala syntax tree manipulations. We have found our system to be fast for both finding counterexamples and finding correctness proofs, and to scale to larger programs than alternative techniques.
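
The flavor of input such a system accepts, using ordinary Scala contracts (`ensuring` is plain Scala; the example is a sketch in the style of the paper's benchmarks, not taken from it): property and program share the same language, and the procedure unfolds the recursive function to prove or refute the postcondition.

    sealed abstract class IntList
    case object Empty extends IntList
    case class Cons(head: Int, tail: IntList) extends IntList

    // Program and specification in the same Scala subset.
    def size(l: IntList): BigInt = (l match {
      case Empty      => BigInt(0)
      case Cons(_, t) => 1 + size(t)
    }) ensuring (res => res >= 0) // inductive postcondition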

[108] An Efficient Decision Procedure for Imperative Tree Data Structures. Automated Deduction (CADE). 2011. [ bib | list ]
We present a new decidable logic called TREX for expressing constraints about imperative tree data structures. In particular, TREX supports a transitive closure operator that can express reachability constraints, which often appear in data structure invariants. We show that our logic is closed under weakest precondition computation, which enables its use for automated software verification. We further show that satisfiability of formulas in TREX is decidable in NP. The low complexity makes it an attractive alternative to more expensive logics such as monadic second-order logic (MSOL) over trees, which have been traditionally used for reasoning about tree data structures.

[107] Scala to the Power of Z3: Integrating SMT and Programming. Automated Deduction (CADE) Tool Demo. 2011. [ bib | list ]
We describe a system that integrates the SMT solver Z3 with the Scala programming language. The system supports the use of the SMT solver for checking satisfiability, unsatisfiability, as well as solution enumeration. The embedding of formula trees into Scala uses the host type system of Scala to prevent the construction of certain ill-typed constraints. The solution enumeration feature integrates into the iteration constructions of Scala and supports writing non-deterministic programs. Using Z3's mechanism of theory extensions, our system also helps users construct custom constraint solvers where the interpretation of predicates and functions is given as Scala code. The resulting system preserves the productivity advantages of Scala while simplifying tasks such as combinatorial search.
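
A hypothetical sketch of the programming model (the names below are invented for illustration and are not the actual Scala^Z3 API): solver-backed enumeration of models plugs into an ordinary for-comprehension, with brute-force search standing in here for the call to Z3.

    // `findAll` enumerates models of a constraint; its body is a stand-in
    // for the solver call that the real system would make.
    def findAll(bound: Int)(spec: (Int, Int) => Boolean): Iterator[(Int, Int)] =
      (for { x <- 0 to bound; y <- 0 to bound if spec(x, y) } yield (x, y)).iterator

    for ((x, y) <- findAll(5)((x, y) => x > 0 && y > 0 && x + y == 5))
      println(s"x = $x, y = $y") // (1,4), (2,3), (3,2), (4,1)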

[106] Interactive Synthesis of Code Snippets. Computer Aided Verification (CAV) Tool Demo. 2011. [ bib | list ]
We describe a tool that applies theorem proving technology to synthesize code fragments that use given library functions. To determine candidate code fragments, our approach takes into account polymorphic type constraints as well as test cases. Our tool interactively displays a ranked list of suggested code fragments that are appropriate for the current program point. We have found our system to be useful for synthesizing code fragments for common programming tasks, and we believe it is a good platform for exploring software synthesis techniques.

[105] Sets with Cardinality Constraints in Satisfiability Modulo Theories. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2011. [ ps | bib | list ]
Boolean Algebra with Presburger Arithmetic (BAPA) is a decidable logic that can express constraints on sets of elements and their cardinalities. Problems from verification of complex properties of software often contain fragments that belong to quantifier-free BAPA (QFBAPA). In contrast to many other NP-complete problems (such as quantifier-free first-order logic or linear arithmetic), the applications of QFBAPA to a broader set of problems have so far been hindered by the lack of an efficient implementation that can be used alongside other efficient decision procedures. We overcome these limitations by extending the efficient SMT solver Z3 with the ability to reason about cardinality (QFBAPA) constraints. Our implementation uses the DPLL(T) mechanism of Z3 to reason about the top-level propositional structure of a QFBAPA formula, improving the efficiency compared to previous implementations. Moreover, we present a new algorithm for automatically decomposing QFBAPA formulas. Our algorithm alleviates the exponential explosion of considering all Venn regions, significantly improving the tractability of formulas with many set variables. Because it is implemented as a theory plugin, our implementation enables Z3 to prove formulas that use QFBAPA constructs with constructs from other theories that Z3 supports, as well as with quantifiers. We have applied our implementation to the verification of functional programs; we show it can automatically prove formulas that no automated approach was reported to be able to prove before.
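
The Venn-region reduction behind QFBAPA, in worked form (standard material summarized here for context, not text from the paper): introduce one nonnegative integer variable per Venn region and rewrite every cardinality as a sum of region counts, leaving a pure Presburger arithmetic problem. For two sets A and B with region counts k_{10}, k_{01}, k_{11} (in A only, in B only, in both):

    |A| = k_{10} + k_{11},   |B| = k_{01} + k_{11},   |A \cup B| = k_{10} + k_{01} + k_{11}

so, for instance, |A \cup B| = |A| + |B| - |A \cap B| becomes a linear identity over the k's. With n set variables there are 2^n regions, which is exactly the explosion the paper's decomposition algorithm mitigates.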

[104] Towards Complete Reasoning about Axiomatic Specifications. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2011. [ ps | bib | list ]
To support verification of expressive properties of functional programs, we consider algebraic style specifications that may relate multiple user-defined functions, and compare multiple invocations of a function for different arguments. We present decision procedures for reasoning about such universally quantified properties of functional programs, using local theory extension methodology. We establish new classes of universally quantified formulas whose satisfiability can be checked in a complete way by finite quantifier instantiation. These classes include single-invocation axioms that generalize standard function contracts, but also certain many-invocation axioms, specifying that functions satisfy congruence, injectivity, or monotonicity with respect to abstraction functions, as well as conjunctions of some of these properties. These many-invocation axioms can specify correctness of abstract data type implementations as well as certain information-flow properties. We also present a decidability-preserving construction that enables the same function to be specified using different classes of decidable specifications on different partitions of its domain.

[103] On Satisfiability Modulo Computable Functions. EPFL Technical Report EPFL-REPORT. 2010. [ bib | list ]
[102] On Rigorous Numerical Computation as a Scala Library. EPFL Technical Report EPFL-REPORT-158754. 2010. [ bib | list ]
[101] Phantm: PHP Analyzer for Type Mismatch (Research Demonstration). ACM SIGSOFT Conference on Foundations of Software Engineering (FSE). 2010. [ ps | bib | list ]
We present Phantm, a static analyzer that uses a flow-sensitive analysis to detect type errors in PHP applications. Phantm can infer types for nested arrays, and can leverage runtime information and procedure summaries for more precise results. Phantm found over 200 true problems when applied to three applications with over 50'000 lines of code, including the popular DokuWiki code base.

[100] Runtime Instrumentation for Precise Flow-Sensitive Type Analysis. International Conference on Runtime Verification. 2010. [ ps | bib | list ]
We describe a combination of runtime information and static analysis for checking properties of complex and configurable systems. The basic idea of our approach is to 1) let the program execute and thereby read the important dynamic configuration data, then 2) invoke static analysis from this runtime state to detect possible errors that can happen in the continued execution. This approach improves analysis precision, particularly with respect to types of global variables and nested data structures. It also enables the resolution of modules that are loaded based on dynamically computed information. We describe an implementation of this approach in a tool that statically computes possible types of variables in PHP applications, including detailed types of nested maps (arrays). PHP is a dynamically typed language; PHP programs extensively use nested value maps, as well as 'include' directives whose arguments are dynamically computed file names. We have applied our analysis tool to over 50'000 lines of PHP code, including the popular DokuWiki software, which has a plug-in architecture. The analysis identified 200 problems in the code and in the type hints of the original source code base. Some of these problems can cause exploits, infinite loops, and crashes. Our experiments show that dynamic information simplifies the development of the analysis and decreases the number of false alarms compared to a purely static analysis approach.

[99] Synthesis for Regular Specifications over Unbounded Domains. FMCAD. 2010. [ ps | bib | list ]
Synthesis from declarative specifications is an ambitious automated method for obtaining systems that are correct by construction. Previous work includes synthesis of reactive finite-state systems from linear temporal logic and its fragments. Further recent work focuses on a different application area by doing functional synthesis over unbounded domains, using a modified Presburger arithmetic quantifier elimination algorithm. We present new algorithms for functional synthesis over unbounded domains based on automata-theoretic methods, with advantages in the expressive power and in the efficiency of synthesized code. Our approach synthesizes functions that meet given regular specifications defined over unbounded sequences of input and output bits. Thanks to the translation from weak monadic second-order logic to automata, this approach supports full Presburger arithmetic as well as bitwise operations on arbitrary length integers. The presence of quantifiers enables finding solutions that optimize a given criterion. Unlike synthesis of reactive systems, our notion of realizability allows functions that require examining the entire input to compute the output. Regardless of the complexity of the specification, our algorithm synthesizes linear-time functions that read the input and directly produce the output. We also describe a technique to synthesize functions with bounded lookahead when possible, which is appropriate for streaming implementations. We implemented our synthesis algorithm and show that it synthesizes efficient functions on a number of examples.

[98] Ordered Sets in the Calculus of Data Structures (Invited Paper). CSL. 2010. [ ps | bib | list ]
Our goal is to identify families of relations that are useful for reasoning about software. We describe such families using decidable quantifier-free classes of logical constraints with a rich set of operations. A key challenge is to define such classes of constraints in a modular way, by combining multiple decidable classes. Working with quantifier-free combinations of constraints makes the combination agenda more realistic and the resulting logics more likely to be tractable than in the presence of quantifiers. Our approach to combination is based on reducing decidable fragments to a common class, Boolean Algebra with Presburger Arithmetic (BAPA). This logic was introduced by Feferman and Vaught in 1959 and can express properties of uninterpreted sets of elements, with set algebra operations and equicardinality relation (consequently, it can also express Presburger arithmetic constraints on cardinalities of sets). Combination by reduction to BAPA allows us to obtain decidable quantifier-free combinations of decidable logics that share BAPA operations. We use the term Calculus of Data Structures to denote a family of decidable constraints that reduce to BAPA. This class includes, for example, combinations of formulas in BAPA, weak monadic second-order logic of k-successors, two-variable logic with counting, and term algebras with certain homomorphisms. The approach of reduction to BAPA generalizes the Nelson-Oppen combination that forms the foundation of constraint solvers used in software verification. BAPA is convenient as a target for reductions because it admits quantifier elimination and its quantifier-free fragment is NP-complete. We describe a new member of the Calculus of Data Structures: a quantifier-free fragment that supports 1) boolean algebra of finite and infinite sets of real numbers, 2) linear arithmetic over real numbers, 3) formulas that can restrict chosen set or element variables to range over integers (providing, among others, the power of mixed integer arithmetic and sets of integers), 4) the cardinality operators, stating whether a given set has a given finite cardinality or is infinite, 5) infimum and supremum operators on sets. Among the applications of this logic are reasoning about the externally observable behavior of data structures such as sorted lists and priority queues, and specifying witness functions for the BAPA synthesis problem. We describe an abstract reduction to BAPA for our logic, proving that the satisfiability of the logic is in NP and that it can be combined with the other fragments of the Calculus of Data Structures.

[97] On Deciding Functional Lists with Sublist Sets. EPFL Technical Report EPFL-REPORT-148361. 2010. [ ps | bib | list ]
Motivated by the problem of deciding verification conditions for the verification of functional programs, we present new decision procedures for automated reasoning about functional lists. We first show how to decide in NP the satisfiability problem for logical constraints containing equality, constructors, selectors, as well as the transitive sublist relation. We then extend this class of constraints with operators to compute the set of all sublists, and the set of objects stored in a list. Finally, we support constraints on sizes of sets, which gives us the ability to compute list length as well as the number of distinct list elements. We show that the extended theory is reducible to the theory of sets with linear cardinality constraints, and therefore still in NP. This reduction enables us to combine our theory with other decidable theories that impose constraints on sets of objects, which further increases the potential of our decidability result in verification of functional and imperative software.

[96] On Locality of One-Variable Axioms and Piecewise Combinations. EPFL Technical Report EPFL-REPORT-148180. 2010. [ ps | bib | list ]
Local theory extensions provide a complete and efficient way for reasoning about satisfiability of certain kinds of universally quantified formulas modulo a background theory. They are therefore useful in automated reasoning, software verification, and synthesis. In this paper, we 1) provide a sufficient condition for locality of axioms with one variable specifying functions and relations, 2) show how to obtain local axiomatizations from non-local ones in some cases, 3) show how to obtain piecewise combination of different local theory extensions.

[95] MUNCH - Automated Reasoner for Sets and Multisets (System Description). IJCAR. 2010. [ ps | bib | list ]
This system description provides an overview of the MUNCH reasoner for sets and multisets. MUNCH takes as input a formula in a logic that supports expressions about sets, multisets, and integers. Constraints over collections and integers are connected using the cardinality operator. Our logic is a fragment of logics of popular interactive theorem provers, and MUNCH is the first fully automated reasoner for this logic. MUNCH reduces input formulas to equisatisfiable linear integer arithmetic formulas. The MUNCH reasoner is publicly available. It is implemented in the Scala programming language and currently uses the SMT solver Z3 to solve the generated integer linear arithmetic constraints.

[94] Comfusy: Complete Functional Synthesis (Tool Presentation). CAV. 2010. [ ps | bib | list ]
Synthesis of program fragments from specifications can make programs easier to write and easier to reason about. We present Comfusy, a tool that extends the compiler for the general-purpose programming language Scala with (non-reactive) functional synthesis over unbounded domains. Comfusy accepts expressions with input and output variables specifying relations on integers and sets. Comfusy symbolically computes the precise domain for the given relation and generates the function from inputs to outputs. The outputs are guaranteed to satisfy the relation whenever the inputs belong to the relation domain. The core of our synthesis algorithm is an extension of quantifier elimination that generates programs to compute witnesses for eliminated variables. We present examples that demonstrate software synthesis using Comfusy, and illustrate how synthesis simplifies software development.

[93] Complete Functional Synthesis. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI). 2010. [ ps | bib | list ]
Synthesis of program fragments from specifications can make programs easier to write and easier to reason about. To integrate synthesis into programming languages, synthesis algorithms should behave in a predictable way: they should succeed for a well-defined class of specifications. They should also support unbounded data types such as numbers and data structures. We propose to generalize decision procedures into predictable and complete synthesis procedures. Such procedures are guaranteed to find code that satisfies the specification if such code exists. Moreover, we identify conditions under which synthesis will statically decide whether the solution is guaranteed to exist, and whether it is unique. We demonstrate our approach by starting from decision procedures for linear arithmetic and data structures and transforming them into synthesis procedures. We establish results on the size and the efficiency of the synthesized code. We show that such procedures are useful as a language extension with implicit value definitions, and we show how to extend a compiler to support such definitions. Our constructs provide the benefits of synthesis to programmers, without requiring them to learn new concepts or give up a deterministic execution model.

[92] On Decision Procedures for Ordered Collections. EPFL Technical Report LARA-REPORT-2010-001. 2010. [ ps | bib | list ]
We describe a decision procedure for a logic that supports 1) finite collections of elements (sets or multisets), 2) the cardinality operator, 3) a total order relation on elements, and 4) min and max operators on entire collections. Among the applications of this logic are 1) reasoning about the externally observable behavior of data structures such as random access priority queues, 2) specifying witness functions for the synthesis problem of set algebra, and 3) reasoning about constraints on orderings arising in termination proofs.

[91] Test Generation through Programming in UDITA. International Conference on Software Engineering (ICSE). 2010. [ ps | bib | http | list ]
We present an approach for describing tests using non-deterministic test generation programs. To write such programs, we introduce UDITA, a Java-based language with non-deterministic choice operators and an interface for generating linked structures. We also describe new algorithms that generate concrete tests by efficiently exploring the space of all executions of non-deterministic UDITA programs. We implemented our approach and incorporated it into the official, publicly available repository of Java PathFinder (JPF), a popular tool for verifying Java programs. We evaluate our technique by generating tests for data structures, refactoring engines, and JPF itself. Our experiments show that test generation using UDITA is faster and leads to test descriptions that are easier to write than in previous frameworks. Moreover, the novel execution mechanism of UDITA is essential for making test generation feasible. Using UDITA, we have discovered a number of bugs in Eclipse, NetBeans, Sun javac, and JPF.
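
A plain-Scala analogue of a non-deterministic generator in the UDITA style (a hypothetical sketch; UDITA itself is Java-based and explores the choices lazily inside Java PathFinder rather than materializing all of them): each generator arrow plays the role of a choice operator, and every path through the choices yields one concrete test input.

    sealed trait BTree
    case object BLeaf extends BTree
    case class BNode(left: BTree, right: BTree) extends BTree

    // All binary trees with `inner` internal nodes; `k <- 0 until inner`
    // is the non-deterministic choice of the left subtree's size.
    def trees(inner: Int): Seq[BTree] =
      if (inner == 0) Seq(BLeaf)
      else for {
        k <- 0 until inner
        l <- trees(k)
        r <- trees(inner - 1 - k)
      } yield BNode(l, r)

    trees(3).size // 5 distinct trees (the Catalan number C_3)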

[90] Predicting and Preventing Inconsistencies in Deployed Distributed Systems. ACM Transactions on Computer Systems. 2010. [ ps | bib | list ]
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This article describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.

[89] Building a Calculus of Data Structures (invited paper). Verification, Model Checking, and Abstract Interpretation (VMCAI). 2010. [ ps | bib | list ]
Techniques such as verification condition generation, predicate abstraction, and expressive type systems reduce software verification to proving formulas in expressive logics. Programs and their specifications often make use of data structures such as sets, multisets, algebraic data types, or graphs. Consequently, formulas generated from verification also involve such data structures. To automate the proofs of such formulas we propose a logic (a 'calculus') of such data structures. We build the calculus by starting from decidable logics of individual data structures, and connecting them through functions and sets, in ways that go beyond the frameworks such as Nelson-Oppen. The results are new decidable logics that can simultaneously specify properties of different kinds of data structures and overcome the limitations of the individual logics. Several of our decidable logics include abstraction functions that map a data structure into its more abstract view (a tree into a multiset, a multiset into a set), into a numerical quantity (the size or the height), or into the truth value of a candidate data structure invariant (sortedness, or the heap property). For algebraic data types, we identify an asymptotic many-to-one condition on the abstraction function that guarantees the existence of a decision procedure. In addition to the combination based on abstraction functions, we can combine multiple data structure theories if they all reduce to the same data structure logic. As an instance of this approach, we describe a decidable logic whose formulas are propositional combinations of formulas in: weak monadic second-order logic of two successors, two-variable logic with counting, multiset algebra with Presburger arithmetic, the Bernays-Schönfinkel-Ramsey class of first-order logic, and the logic of algebraic data types with the set content function. The subformulas in this combination can share common variables that refer to sets of objects along with the common set algebra operations. Such sound and complete combination is possible because the relations on sets definable in the component logics are all expressible in Boolean Algebra with Presburger Arithmetic. Presburger arithmetic and its new extensions play an important role in our decidability results. In several cases, when we combine logics that belong to NP, we can prove that satisfiability for the combined logic is still in NP.

[88] Collections, Cardinalities, and Relations. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2010. [ ps | bib | list ]
Logics that involve collections (sets, multisets) and cardinality constraints are useful for reasoning about unbounded data structures and concurrent processes. To make such logics more useful in verification, this paper extends them with the ability to compute direct and inverse relation and function images. We establish decidability and complexity bounds for the extended logics.

[87] Decision Procedures for Algebraic Data Types with Abstractions. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2010. [ ps | bib | list ]
We describe a family of decision procedures that extend the decision procedure for quantifier-free constraints on recursive algebraic data types (term algebras) to support recursive abstraction functions. Our abstraction functions are catamorphisms (term algebra homomorphisms) mapping algebraic data type values into values in other decidable theories (e.g. sets, multisets, lists, integers, booleans). Each instance of our decision procedure family is sound; we identify a widely applicable many-to-one condition on abstraction functions that implies completeness. Complete instances of our decision procedure include the following correctness statements: 1) a functional data structure implementation satisfies a recursively specified invariant, 2) such data structure conforms to a contract given in terms of sets, multisets, lists, sizes, or heights, 3) a transformation of a formula (or lambda term) abstract syntax tree changes the set of free variables in the specified way.
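
An example of the kind of abstraction function the procedure supports (a sketch in the paper's spirit, not an excerpt): a catamorphism mapping a tree term algebra into the decidable theory of sets, so that constraints mixing tree constructors with `content` remain decidable.

    sealed trait STree
    case object SLeaf extends STree
    case class SNode(left: STree, value: Int, right: STree) extends STree

    // Catamorphism: abstract a tree into the set of elements it stores.
    def content(t: STree): Set[Int] = t match {
      case SLeaf          => Set.empty
      case SNode(l, v, r) => content(l) ++ Set(v) ++ content(r)
    }

    // Correctness statement 2 from the abstract, instantiated: an insert
    // operation conforms to a contract expressed over set contents.
    def insert(t: STree, x: Int): STree = SNode(t, x, SLeaf)
    assert(content(insert(SLeaf, 3)) == Set(3))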

[86] On Complete Functional Synthesis. EPFL Technical Report LARA-REPORT-2009-006. 2009. [ ps | bib | list ]
[85] On Test Generation through Programming in UDITA. EPFL, IC Technical Report LARA-REPORT-2009-05. 2009. [ ps | bib | http | list ]
We present an approach for describing tests using non-deterministic test generation programs. To write test generation programs, we introduce UDITA, a Java-based language with non-deterministic choice operators and an interface for generating linked structures. We also describe new algorithms that generate concrete tests by efficiently exploring the space of all executions of non-deterministic UDITA programs. We implemented our approach and incorporated it into the official, publicly available repository of Java PathFinder (JPF), a popular tool for verifying Java programs. We evaluate our technique by generating tests for data structures, refactoring engines, and JPF itself. Our experiments show that test generation using UDITA is faster and leads to test descriptions that are easier to write than in previous frameworks. Moreover, the novel execution mechanism of UDITA is essential for making test generation feasible. Using UDITA, we have discovered a number of previously unknown bugs in Eclipse, NetBeans, Sun javac, and JPF.

[84] On Decision Procedures for Collections, Cardinalities, and Relations. EPFL Technical Report LARA-REPORT-2009-004. 2009. [ ps | bib | list ]
Logics that involve collections (sets, multisets) and cardinality constraints are useful for reasoning about unbounded data structures and concurrent processes. To make such logics more useful in verification, this paper extends them with the ability to compute direct and inverse relation and function images. We establish decidability and complexity bounds for the extended logics.

[83] On Decision Procedures for Algebraic Data Types with Abstractions. EPFL Technical Report LARA-REPORT-2009-003. 2009. [ ps | bib | list ]
We describe a parameterized decision procedure that extends the decision procedure for functional recursive algebraic data types (trees) with the ability to specify and reason about abstractions of data structures. The abstract values are specified using recursive abstraction functions that map trees into other data types that have decidable theories. Our result yields a decidable logic which can be used to prove that implementations of functional data structures satisfy recursively specified invariants and conform to interfaces given in terms of sets, multisets, or lists, or to increase the automation in proof assistants.

[82] Combining Theories with Shared Set Operations. FroCoS: Frontiers in Combining Systems. 2009. [ ps | bib | list ]
Motivated by applications in software verification, we explore automated reasoning about the non-disjoint combination of theories of infinitely many finite structures, where the theories share set variables and set operations. We prove a combination theorem and apply it to show the decidability of the satisfiability problem for a class of formulas obtained by applying propositional connectives to formulas belonging to: 1) Boolean Algebra with Presburger Arithmetic (with quantifiers over sets and integers), 2) weak monadic second-order logic over trees (with monadic second-order quantifiers), 3) two-variable logic with counting quantifiers (ranging over elements), 4) the EPR class of first-order logic with equality (with ∃*∀* quantifier prefix), and 5) the quantifier-free logic of multisets with cardinality constraints.

[81] On Combining Theories with Shared Set Operations. EPFL Technical Report LARA-REPORT-2009-002. 2009. [ bib | list ]
[80] Simplifying Distributed System Development. 12th Workshop on Hot Topics in Operating Systems. 2009. [ ps | bib | list ]
Distributed systems are difficult to design and develop. The difficulties arise both in basic safety correctness properties, and in achieving high performance. As a result of this complexity, the implementation of a distributed system often contains the basic algorithm coupled with an embedded strategy for making choices, such as the choice of a node to interact with. This paper proposes a programming model for distributed systems where 1) the application explicitly exposes the choices (decisions) that it needs to make as well as the objectives that it needs to maximize; 2) the application and the runtime system cooperate to maintain a predictive model of the distributed system and its environment; and 3) the runtime uses the predictive model to resolve the choices so as to maximize the objectives. We claim that this programming model results in simpler source code and lower development effort, and can lead to increased performance and robustness to various deployment settings. Our initial results of applying this model to a sample application are encouraging.

[79] An Integrated Proof Language for Imperative Programs. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI). 2009. [ ps | bib | list ]
We present an integrated proof language for guiding the actions of multiple reasoning systems as they work together to prove complex correctness properties of imperative programs. The language operates in the context of a program verification system that uses multiple reasoning systems to discharge generated proof obligations. It is designed to 1) enable developers to resolve key choice points in complex program correctness proofs, thereby enabling automated reasoning systems to successfully prove the desired correctness properties; 2) allow developers to identify key lemmas for the reasoning systems to prove, thereby guiding the reasoning systems to find an effective proof decomposition; 3) enable multiple reasoning systems to work together productively to prove a single correctness property by providing a mechanism that developers can use to divide the property into lemmas, each of which is suitable for a different reasoning system; and 4) enable developers to identify specific lemmas that the reasoning systems should use when attempting to prove other lemmas or correctness properties, thereby appropriately confining the search space so that the reasoning systems can find a proof in an acceptable amount of time. The language includes a rich set of declarative proof constructs that enables developers to direct the reasoning systems as little or as much as they desire. Because the declarative proof statements are embedded into the program as specialized comments, they also serve as verified documentation and are a natural extension of the assertion mechanism found in most program verification systems. We have implemented our integrated proof language in the context of a program verification system for Java and used the resulting system to verify a collection of linked data structure implementations. Our experience indicates that our proof language makes it possible to successfully prove complex program correctness properties that are otherwise beyond the reach of automated reasoning systems.

[78] On Set-Driven Combination of Logics and Verifiers. EPFL Technical Report LARA-REPORT-2009-001. 2009. [ bib | list ]
We explore the problem of automated reasoning about the non-disjoint combination of logics that share set variables and operations. We prove a general combination theorem, and apply it to show the decidability for the quantifier-free combination of formulas in WS2S, two-variable logic with counting, and Boolean Algebra with Presburger Arithmetic. Furthermore, we present an over-approximating algorithm that uses such combined logics to synthesize universally quantified invariants of infinite-state systems. The algorithm simultaneously synthesizes loop invariants of interest, and infers the relationships between sets to exchange the information between logics. We have implemented this algorithm and used it to prove detailed correctness properties of operations of linked data structure implementations.

[77] Simplifying Distributed System Development. EPFL Technical Report NSL-REPORT-2009-001. 2009. [ bib | list ]
Distributed systems are difficult to design and develop. The difficulties arise both in basic safety correctness properties, and in achieving high performance. As a result of this complexity, the implementation of a distributed system often contains the basic algorithm coupled with an embedded strategy for making choices, such as the choice of a node to interact with. This paper proposes a programming model for distributed systems where 1) the application explicitly exposes the choices (decisions) that it needs to make as well as the objectives that it needs to maximize; 2) the application and the runtime system cooperate to maintain a predictive model of the distributed system and its environment; and 3) the runtime uses the predictive model to resolve the choices so as to maximize the objectives. We claim that this programming model results in simpler source code and lower development effort, and can lead to increased performance and robustness to various deployment settings. Our initial results of applying this model to a sample application are encouraging.

[76] CrystalBall: Predicting and Preventing Inconsistencies in Deployed Distributed Systems. 6th USENIX Symp. Networked Systems Design and Implementation (NSDI '09). 2009. [ ps | bib | .html | list ]
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This paper describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.

[75] Opis: Reliable Distributed Systems in OCaml. ACM SIGPLAN TLDI. 2009. [ bib | list ]
Concurrency and distribution pose algorithmic and implementation challenges in developing reliable distributed systems, making the field an excellent testbed for evaluating programming language and verification paradigms. Several specialized domain-specific languages and extensions of memory-unsafe languages were proposed to aid distributed system development. We present an alternative to these approaches, showing that modern, higher-order, strongly typed, memory-safe languages provide an excellent vehicle for developing and debugging distributed systems. We present Opis, a functional-reactive approach for developing distributed systems in Objective Caml. An Opis protocol description consists of a reactive function (called event function) describing the behavior of a distributed system node. The event functions in Opis are built from pure functions as building blocks, composed using the Arrow combinators. Such an architecture aids reasoning about event functions both informally and using interactive theorem provers. For example, it facilitates simple termination arguments. Given a protocol description, a developer can use higher-order library functions of Opis to 1) deploy the distributed system, 2) run the distributed system in a network simulator with full-replay capabilities, 3) apply explicit-state model checking to the distributed system, detecting undesirable behaviors, and 4) do performance analysis on the system. We describe the design and implementation of Opis, and present our experience in using Opis to develop peer-to-peer overlay protocols, including the Chord distributed hash table and the Cyclon random gossip protocol. We found that using Opis results in high programmer productivity and leads to easily composable protocol descriptions. Opis tools were effective in helping identify and eliminate correctness and performance problems during distributed system development.
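
In Scala terms (Opis itself is written in OCaml and composes event functions with Arrow combinators; the rendering and type names below are our assumptions), an event function is a pure transition from a node's state and an input event to a new state plus emitted actions:

    // Illustrative Opis-style event function: a pure function of (state, event),
    // assembled from smaller pure pieces, hence easy to simulate, replay,
    // and model check.
    object OpisSketch {
      sealed trait In
      case class Recv(from: Int, payload: String) extends In
      case object Timer extends In

      sealed trait Out
      case class Send(to: Int, payload: String) extends Out

      case class NodeState(peers: List[Int], acked: Int)

      type EventFn = (NodeState, In) => (NodeState, List[Out])

      val handle: EventFn = {
        case (s, Recv(from, p)) => (s.copy(acked = s.acked + 1), List(Send(from, "ack:" + p)))
        case (s, Timer)         => (s, s.peers.map(Send(_, "ping")))
      }
    }

Because handle is a total, side-effect-free function over In, the same definition can be deployed over real sockets, driven by a simulator with replay, or explored by a model checker.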

[74] On Delayed Choice Execution for Falsification. EPFL, IC Technical Report LARA-REPORT-2008-08. 2008. [ bib | list ]
[73] Constraint Solving for Software Reliability and Text Analysis (Research Plan). EPFL Technical Report LARA-REPORT-2008-007. 2008. [ bib | list ]
We will develop and implement new algorithms for constraint solving and apply them to construct two classes of tools: 1) bug finding and verification tools building on tools such as Java PathFinder and Jahob; 2) tools for deep semantic analysis of texts containing a mix of English, source code, and mathematical formulas. Our techniques take as their starting point constraint solving algorithms developed in the rapidly expanding field of satisfiability modulo theories (SMT). We will use state-of-the-art techniques to implement an SMT constraint solver in the Scala programming language running on JVM platforms. One of the distinguishing features of our constraint solver will be the ability to analyze rich constraints that include not only quantifiers, numerical domains and data structures, but also lambda expressions and recursive definitions. To be effective for program analysis, the constraint solver will have native support for transition systems describing program semantics. This will enable it to tackle sources of exponential explosion in context-sensitive and path-sensitive analyses such as symbolic execution. Such analyses can identify a wide range of bugs, many of which can cause crashes and security problems. In the area of text processing, we expect our constraint solver to enable efficient reasoning about rich semantic domains arising in computational semantics of natural language. This capability will make the solver a useful component of tools for creating semantically annotated text and post-processing search results in scientific and engineering domains. To evaluate this hypothesis, we will develop a tool for analyzing text whose subject is the explanation of source code, programming language semantics, compilation, and program analysis.

[72] Opis: Reliable Distributed Systems in OCaml. EPFL-IC-NSL Technical Report NSL-REPORT-2008-001. 2008. [ bib | list ]
The importance of distributed systems is growing as computing devices become ubiquitous and bandwidth becomes plentiful. Concurrency and distribution pose algorithmic and implementation challenges in developing reliable distributed systems, making the field an excellent testbed for evaluating programming language and verification paradigms. Recently, several specialized domain-specific languages and extensions of memory-unsafe languages have been proposed to aid distributed system development. In this paper we propose an alternative to these approaches, arguing that modern, higher-order, strongly typed, memory-safe languages provide an excellent vehicle for developing and debugging distributed systems. We present Opis, a functional-reactive approach for developing distributed systems in Objective Caml. In Opis, a protocol description consists of a reactive function (called event function) describing the behavior of a distributed system node. The event functions in Opis are built from pure functions as building blocks, composed using the Arrow combinators. This facilitates reasoning about event functions both informally and using interactive provers. For example, this approach leads to simple termination arguments. Given a protocol description, a developer can use higher-order library functions of Opis to 1) deploy the distributed system, 2) run the distributed system in a network simulator with full-replay capabilities, 3) apply explicit-state model checking to the distributed system and detect any undesirable behaviors, and 4) do performance analysis on the system. We describe the design and implementation of Opis, and present our experience in using Opis to develop peer-to-peer overlay protocols including the Chord distributed hash table and the Cyclon random gossip protocol. We have found that using Opis results in high programmer productivity and leads to concise and easily composable protocol descriptions. Moreover, Opis tools were effective in helping identify and eliminate correctness and performance problems during distributed system development.

[71] CrystalBall: Predicting and Preventing Inconsistencies in Deployed Distributed Systems. EPFL-IC-LARA Technical Report LARA-REPORT-2008-006. 2008. [ bib | list ]
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We present a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This paper describes the design and the implementation of this approach, termed CrystalBall. Using CrystalBall we identified new bugs in mature Mace implementations of a random overlay tree and the Chord distributed hash table. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at run-time.

[70] Fractional Collections with Cardinality Bounds, and Mixed Integer Linear Arithmetic with Stars. Computer Science Logic (CSL). 2008. [ps]bib | list ]
We present decision procedures for logical constraints that support reasoning about collections of elements such as sets, multisets, and fuzzy sets. Element membership in such collections is given by a characteristic function from a finite universe (of unknown size) to a subset of rational numbers specified by user-defined constraints in mixed linear integer-rational arithmetic. Our logic supports standard operators such as union, intersection, difference, or any operation defined pointwise using mixed linear integer-rational arithmetic. Moreover, it supports the notion of cardinality of the collection. Deciding formulas in such a logic has applications in the verification of data structures. Our decision procedure reduces satisfiability of formulas with collections to satisfiability of formulas in an extension of mixed linear integer-rational arithmetic with an “unbounded sum” operator. Such an extension is also interesting in its own right because it can encode reachability problems for a simple class of transition systems. We present a decision procedure for the resulting extension, using a satisfiability-preserving transformation that eliminates the unbounded sum operator. Our decidability result subsumes previous special cases for sets and multisets.

[69] Linear Arithmetic with Stars. Computer-Aided Verification (CAV). 2008. [ps]bib | list ]
We consider an extension of integer linear arithmetic with a “star” operator that takes the closure under vector addition of the solution set of a linear arithmetic subformula. We show that the satisfiability problem for this extended language remains in NP (and is therefore NP-complete). Our proof uses the semilinear set characterization of solutions of integer linear arithmetic formulas, as well as a generalization of a recent result on sparse solutions of integer linear programming problems. As a consequence of our result, we present worst-case optimal decision procedures for two NP-hard problems that were previously not known to be in NP. The first is the satisfiability problem for a logic of sets, multisets (bags), and cardinality constraints, which has applications in verification, interactive theorem proving, and description logics. The second is the reachability problem for a class of transition systems whose transitions increment the state vector by solutions of integer linear arithmetic formulas.
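
Schematically, if S is the solution set of a linear arithmetic formula F, the star operator closes it under vector addition; the formula on the last line is our own illustrative instance of the extended language, not an example from the paper:

    S = \{\, \vec{x} \in \mathbb{Z}^{k} \mid F(\vec{x}) \,\}, \qquad
    S^{*} = \{\, \vec{x}_1 + \cdots + \vec{x}_n \mid n \geq 0,\ \vec{x}_i \in S \,\}

    % membership in a closure, conjoined with further linear constraints:
    \vec{u} \in \{ (x, y) \mid 0 \leq x \leq y \}^{*} \;\land\; u_1 + u_2 = 10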

[68] Full Functional Verification of Linked Data Structures. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI). 2008. [ps]bib | list ]
We present the first verification of full functional correctness for a range of linked data structure implementations, including mutable lists, trees, graphs, and hash tables. Specifically, we present the use of the Jahob verification system to verify formal specifications, written in classical higher-order logic, that completely capture the desired behavior of the Java data structure implementations (with the exception of properties involving execution time and/or memory consumption). Given that the desired correctness properties include intractable constructs such as quantifiers, transitive closure, and lambda abstraction, it is a challenge to successfully prove the generated verification conditions. Our verification system uses 'integrated reasoning' to split each verification condition into a conjunction of simpler subformulas, and then applies a diverse collection of specialized decision procedures, first-order theorem provers, and, in the worst case, interactive theorem provers to prove each subformula. Techniques such as replacing complex subformulas with stronger but simpler alternatives, exploiting structure inherently present in the verification conditions, and, when necessary, inserting verified lemmas and proof hints into the imperative source code make it possible to seamlessly integrate all of the specialized decision procedures and theorem provers into a single powerful integrated reasoning system. By appropriately applying multiple proof techniques to discharge different subformulas, this reasoning system can effectively prove the complex and challenging verification conditions that arise in this context.

[67] On Linear Arithmetic with Stars. EPFL Technical Report LARA-REPORT-2008-005. 2008. [ps]bib | list ]
We consider an extension of integer linear arithmetic with a star operator that takes the closure under vector addition of the set of solutions of a linear arithmetic subformula. We show that the satisfiability problem for this language is in NP (and therefore NP-complete). Our proof uses a generalization of a recent result on sparse solutions of integer linear programming problems. We present two consequences of our result. The first one is an optimal decision procedure for a logic of sets, multisets, and cardinalities that has applications in verification, interactive theorem proving, and description logics. The second is NP-completeness of the reachability problem for a class of “homogeneous” transition systems whose transitions are defined using integer linear arithmetic formulas.

[66] On Static Analysis for Expressive Pattern Matching. EPFL Technical Report LARA-REPORT-2008-004. 2008. [ps]bib | list ]
Pattern matching is a widespread programming language construct that enables definitions of values by cases, generalizing if-then-else and case statements. The cases in a pattern matching expression should be exhaustive: when the value does not match any of the cases, the expression throws a run-time exception. Similarly, each pattern should be reachable, and, if possible, patterns should be disjoint to facilitate reasoning. Current compilers use simple analyses to check patterns. Such analyses ignore pattern guards, use static types to approximate possible expression values, and do not take into account properties of user-defined functions. We present a design and implementation of a new analysis of pattern matching expressions. Our analysis detects a wider class of errors and reports fewer false alarms than previous approaches. It checks disjointness, reachability, and exhaustiveness of patterns by expressing these conditions as formulas and proving them using decision procedures and theorem provers. It achieves precision by propagating possible values through nested expressions and approximating pattern-matching guards with formulas. It supports user-defined “extractor” functions in patterns by relying on specifications of relationships between the domains of such functions. The result is the first analysis that enables verified, declarative pattern matching with guards in the presence of data abstraction. We have implemented our analysis and describe our experience in checking a range of pattern matching expressions in a subset of the Scala programming language.
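
For example (our own illustration, not taken from the paper), the following Scala match is exhaustive only because the guards n < 0 and n >= 0 jointly cover Int, a fact that guard-ignoring checks cannot establish:

    // Exhaustive in fact, but a guard-ignoring analysis reports a missing case:
    // proving exhaustiveness requires translating the guards to formulas and
    // showing that (n < 0) \/ (n >= 0) is valid for every Int.
    object MatchSketch {
      def describe(xs: List[Int]): String = xs match {
        case Nil              => "empty"
        case n :: _ if n < 0  => "starts negative"
        case n :: _ if n >= 0 => "starts non-negative"
      }
    }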

[65] Decision Procedures for Multisets with Cardinality Constraints. 9th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI). 2008. [ps]bib | list ]
Applications in software verification and interactive theorem proving often involve reasoning about sets of objects. Cardinality constraints on such collections also arise in these scenarios. Multisets arise for reasons analogous to sets: abstracting the content of a linked data structure with duplicate elements leads to multisets. Interactive theorem provers such as Isabelle specify theories of multisets and prove a number of theorems about them to enable their use in interactive verification. However, the decidability and complexity of constraints on multisets is much less understood than for constraints on sets. The first contribution of this paper is a polynomial-space algorithm for deciding expressive quantifier-free constraints on multisets with cardinality operators. Our decision procedure reduces in polynomial time constraints on multisets to constraints in an extension of quantifier-free Presburger arithmetic with certain “unbounded sum” expressions. We prove bounds on solutions of resulting constraints and describe a polynomial-space decision procedure for these constraints. The second contribution of this paper is a proof that adding quantifiers to a constraint language containing subset and cardinality operators yields undecidable constraints. The result follows by reduction from Hilbert's 10th problem.
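
Schematically (an illustrative instance of ours, not drawn from the paper), a multiset is represented by its multiplicity function, cardinality becomes an unbounded sum over the universe, and a constraint such as |M ⊎ N| = |M| + |N| reduces to:

    % m, n, p are the multiplicity functions of M, N, and M ⊎ N
    \forall e.\ p(e) = m(e) + n(e)
    \;\land\;
    \sum_{e} p(e) = \sum_{e} m(e) + \sum_{e} n(e)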

[64] Runtime Checking for Separation Logic. 9th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI). 2008. [ bib | list ]
Separation logic is a popular approach for specifying properties of recursive mutable data structures. Several existing systems verify a subclass of separation logic specifications using static analysis techniques. Checking data structure specifications during program execution is an alternative to static verification: it can enforce the sophisticated specifications for which static verification fails, and it can help debug incorrect specifications and code by detecting concrete counterexamples to their validity. This paper presents Separation Logic Invariant ChecKer (SLICK), a runtime checker for separation logic specifications. We show that, although the recursive style of separation logic predicates is well suited for runtime execution, the implicit footprint and existential quantification make efficient runtime checking challenging. To address these challenges we introduce a coloring technique for efficiently checking method footprints and describe techniques for inferring values of existentially quantified variables. We have implemented our runtime checker in the context of a tool for enforcing specifications of Java programs. Our experience suggests that our runtime checker is a useful companion to a static verifier for separation logic specifications.

[63] Decision Procedures for Multisets with Cardinality Constraints. EPFL Technical Report LARA-REPORT-2007-002. 2007. [ bib | list ]
Applications in software verification and interactive theorem proving often involve reasoning about sets of objects. Cardinality constraints on such collections also arise in these applications. Multisets arise in these applications for reasons analogous to sets: abstracting the content of a linked data structure with duplicate elements leads to multisets. Interactive theorem provers such as Isabelle specify theories of multisets and prove a number of theorems about them to enable their use in interactive verification. However, the decidability and complexity of constraints on multisets is much less understood than for constraints on sets. The first contribution of this paper is a polynomial-space algorithm for deciding expressive quantifier-free constraints on multisets with cardinality operators. Our decision procedure reduces in polynomial time constraints on multisets to constraints in an extension of quantifier-free Presburger arithmetic with certain “unbounded sum” expressions. We prove bounds on solutions of resulting constraints and describe a polynomial-space decision procedure for these constraints. The second contribution of this paper is a proof that adding quantifiers to a constraint language containing subset and cardinality operators yields undecidable constraints. The result follows by reduction from Hilbert's 10th problem.

[62] Runtime Checking for Separation Logic. EPFL Technical Report LARA-REPORT-2007-003. 2007. [ bib | list ]
Separation logic is a popular approach for specifying properties of recursive mutable data structures. Several existing systems verify a subclass of separation logic specifications using static analysis techniques. Checking data structure specifications during program execution is an alternative to static verification: it can enforce the sophisticated specifications for which static verification fails, and it can help debug incorrect specifications and code by detecting concrete counterexamples to their validity. This paper presents Separation Logic Invariant ChecKer (SLICK), a runtime checker for separation logic specifications. We show that, although the recursive style of separation logic predicates is well suited for runtime execution, the implicit footprint and existential quantification make efficient runtime checking challenging. To address these challenges we introduce a coloring technique for efficiently checking method footprints and describe techniques for inferring values of existentially quantified variables. We have implemented our runtime checker in the context of a tool for enforcing specifications of Java programs. Our experience suggests that our runtime checker is a useful companion to a static verifier for separation logic specifications.

[61] Towards Efficient Satisfiability Checking for Boolean Algebra with Presburger Arithmetic. Conference on Automated Deduction (CADE-21). 2007. [ps]bib | list ]
Boolean Algebra with Presburger Arithmetic (BAPA) is a decidable logic that combines 1) Boolean algebra of sets of uninterpreted elements (BA) and 2) Presburger arithmetic (PA). BAPA can express relationships between integer variables and cardinalities of unbounded sets. In combination with other decision procedures and theorem provers, BAPA is useful for automatically verifying quantitative properties of data structures. This paper examines QFBAPA, the quantifier-free fragment of BAPA. The computational complexity of QFBAPA satisfiability was previously unknown; previous QFBAPA algorithms have non-deterministic exponential time complexity due to an explosion in the number of introduced integer variables. This paper shows, for the first time, how to avoid such exponential explosion. We present an algorithm for checking satisfiability of QFBAPA formulas by reducing them to formulas of quantifier-free PA, with only O(n log(n)) increase in formula size. We prove the correctness of our algorithm using a theorem about sparse solutions of integer linear programming problems. This is the first proof that QFBAPA satisfiability is in NP and therefore NP-complete. We implemented our algorithm in the context of the Jahob verification system. Our preliminary experiments suggest that our algorithm, although not necessarily better for proving formula unsatisfiability, is more effective in detecting formula satisfiability than previous approaches.
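
For intuition, a typical QFBAPA formula mixes set algebra with cardinalities, for example (our own instance):

    |x \cup y| = |x| + |y| - |x \cap y| \;\land\; |x \setminus y| \leq k

A naive reduction introduces one integer unknown per Venn region, 2^n of them for n set variables; the sparse-solution theorem lets the algorithm assume that only a small, polynomially bounded collection of regions is non-empty, which is what permits the quasi-linear increase in formula size instead of an exponential one.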

[60] Polynomial Constraints for Sets with Cardinality Bounds. Foundations of Software Science and Computation Structures (FOSSACS). 2007. [ps]bib | list ]
Logics that can reason about sets and their cardinality bounds are useful in program analysis, program verification, databases, and knowledge bases. This paper presents a class of constraints on sets and their cardinalities for which the satisfiability and the entailment problems are computable in polynomial time. Our class of constraints, based on tree-shaped formulas, is unique in being simultaneously tractable and able to express 1) that a set is a union of other sets, 2) that sets are disjoint, and 3) that a set has cardinality within a given range. As the main result we present a polynomial-time algorithm for checking entailment of our constraints.

[59] Verifying Complex Properties using Symbolic Shape Analysis. Workshop on Heap Abstraction and Verification (collocated with ETAPS). 2007. [ps]bib | list ]
[58] Runtime Checking for Program Verification. Workshop on Runtime Verification. 2007. [ bib | list ]
The process of verifying that a program conforms to its specification is often hampered by errors in both the program and the specification. A runtime checker that can evaluate formal specifications can be useful for quickly identifying such errors. This paper describes our preliminary experience with incorporating run-time checking into the Jahob verification system and discusses some lessons we learned in this process. One of the challenges in building a runtime checker for a program verification system is that the language of invariants and assertions is designed for simplicity of semantics and tractability of proofs, and not for run-time checking. Some of the more challenging constructs include existential and universal quantification, set comprehension, specification variables, and formulas that refer to past program states. In this paper, we describe how we handle these constructs in our runtime checker, and describe directions for future work.

[57] Modular Data Structure Verification. PhD Thesis, EECS Department, Massachusetts Institute of Technology. 2007. [ps]bib | http | list ]
This dissertation describes an approach for automatically verifying data structures, focusing on techniques for automatically proving formulas that arise in such verification. I have implemented this approach with my colleagues in a verification system called Jahob. Jahob verifies properties of Java programs with dynamically allocated data structures. Developers write Jahob specifications in classical higher-order logic (HOL); Jahob reduces the verification problem to deciding the validity of HOL formulas. I present a new method for proving HOL formulas by combining automated reasoning techniques. My method consists of 1) splitting formulas into individual HOL conjuncts, 2) soundly approximating each HOL conjunct with a formula in a more tractable fragment and 3) proving the resulting approximation using a decision procedure or a theorem prover. I present three concrete logics; for each logic I show how to use it to approximate HOL formulas, and how to decide the validity of formulas in this logic. First, I present an approximation of HOL based on a translation to first-order logic, which enables the use of existing resolution-based theorem provers. Second, I present an approximation of HOL based on field constraint analysis, a new technique that enables decision procedures for special classes of graphs (such as monadic second-order logic over trees) to be applied to arbitrary graphs. Third, I present an approximation using Boolean Algebra with Presburger Arithmetic (BAPA), a logic that combines reasoning about sets of elements with reasoning about cardinalities of sets. BAPA can express relationships between sizes of data structures and invariants that correlate data structure size with integer variables. I present the first implementation of a BAPA decision procedure, and establish the exact complexity bounds for BAPA and quantifier-free BAPA. Together, these techniques enabled Jahob to modularly and automatically verify data structure implementations based on singly and doubly-linked lists, trees with parent pointers, priority queues, and hash tables. In particular, Jahob was able to prove that data structure implementations satisfy their specifications, maintain key data structure invariants expressed in a rich logical notation, and never produce run-time errors such as null dereferences or out of bounds accesses.

[56] Quantifier-Free Boolean Algebra with Presburger Arithmetic is NP-complete. MIT CSAIL Technical Report MIT-CSAIL-TR-2007-001. 2007. [ps]bib | http | list ]
Boolean Algebra with Presburger Arithmetic (BAPA) combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables and cardinalities of unbounded finite sets and can be used to express verification conditions that arise in the verification of data structure consistency properties. In this report I consider the Quantifier-Free fragment of Boolean Algebra with Presburger Arithmetic (QFBAPA). Previous algorithms for QFBAPA had non-deterministic exponential time complexity. In this report I show that QFBAPA is in NP, and is therefore NP-complete. My result yields an algorithm for checking satisfiability of QFBAPA formulas by converting them to polynomially sized formulas of quantifier-free Presburger arithmetic. I expect this algorithm to substantially extend the range of QFBAPA problems whose satisfiability can be checked in practice.

[55] Using First-Order Theorem Provers in the Jahob Data Structure Verification System. Verification, Model Checking and Abstract Interpretation. 2007. [ps]bib | list ]
This paper presents our integration of efficient resolution-based theorem provers into the Jahob data structure verification system. Our experimental results show that this approach enables Jahob to automatically verify the correctness of a range of complex dynamically instantiable data structures, such as hash tables and search trees, without the need for interactive theorem proving or techniques tailored to individual data structures. Our primary technical results include: (1) a translation from higher-order logic to first-order logic that enables the application of resolution-based theorem provers and (2) a proof that eliminating type (sort) information in formulas is both sound and complete, even in the presence of a generic equality operator. Our experimental results show that the elimination of type information often dramatically decreases the time required to prove the resulting formulas. These techniques enabled us to verify complex correctness properties of Java programs such as a mutable set implemented as an imperative linked list, a finite map implemented as a functional ordered tree, a hash table with a mutable array, and a simple library system example that uses these container data structures. Our system verifies (in a matter of minutes) that data structure operations correctly update the finite map, that they preserve data structure invariants (such as ordering of elements, membership in appropriate hash table buckets, or relationships between sets and relations), and that there are no run-time errors such as null dereferences or array out of bounds accesses.

[54] On Using First-Order Theorem Provers in the Jahob Data Structure Verification System. MIT Technical Report MIT-CSAIL-TR-2006-072. 2006. [ps]bib | http | list ]
This paper presents our integration of efficient resolution-based theorem provers into the Jahob data structure verification system. Our experimental results show that this approach enables Jahob to automatically verify the correctness of a range of complex dynamically instantiable data structures, including data structures such as hash tables and search trees, without the need for interactive theorem proving or techniques tailored to individual data structures. Our primary technical results include: (1) a translation from higher-order logic to first-order logic that enables the application of resolution-based theorem provers and (2) a proof that eliminating type (sort) information in formulas is both sound and complete, even in the presence of a generic equality operator. Our experimental results show that the elimination of type information dramatically decreases the time required to prove the resulting formulas. These techniques enabled us to verify complex correctness properties of Java programs such as a mutable set implemented as an imperative linked list, a finite map implemented as a functional ordered tree, a hash table with a mutable array, and a simple library system example that uses these container data structures. Our system verifies (in a matter of minutes) that data structure operations correctly update the finite map, that they preserve data structure invariants (such as ordering of elements, membership in appropriate hash table buckets, or relationships between sets and relations), and that there are no run-time errors such as null dereferences or array out of bounds accesses.

[53] Modular Pluggable Analyses for Data Structure Consistency. IEEE Transactions on Software Engineering (TSE). 2006. [ps]bib | http | list ]
Hob is a program analysis system that enables the focused application of multiple analyses to different modules in the same program. In our approach, each module encapsulates one or more data structures and uses membership in abstract sets to characterize how objects participate in data structures. Each analysis verifies that the implementation of the module 1) preserves important internal data structure consistency properties and 2) correctly implements a set algebra interface that characterizes the effects of operations on the data structure. Collectively, the analyses use the set algebra to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses. We implemented our system and deployed several pluggable analyses, including a flag analysis for modules in which abstract set membership is determined by a flag field in each object, a PALE shape analysis plugin, and a theorem proving plugin for analyzing arbitrarily complicated data structures. Our experience shows that our system can effectively 1) verify the consistency of data structures encapsulated within a single module and 2) combine analysis results from different analysis plugins to verify properties involving objects shared by multiple modules analyzed by different analyses.

[52] On Verifying Complex Properties using Symbolic Shape Analysis. Max-Planck Institute for Computer Science Technical Report MPI-I-2006-2-1. 2006. [ bib | http | list ]
One of the main challenges in the verification of software systems is the analysis of statically unbounded data structures with dynamic memory allocation, such as linked data structures and arrays. We describe Bohne, a new analysis for verifying data structures. Bohne verifies data structure operations and shows that 1) the operations preserve data structure invariants and 2) the operations satisfy their specifications expressed in terms of changes to the set of objects stored in the data structure. During the analysis, Bohne infers loop invariants in the form of disjunctions of universally quantified Boolean combinations of formulas, represented as sets of binary decision diagrams. To synthesize loop invariants of this form, Bohne uses a combination of decision procedures for Monadic Second-Order Logic over trees, SMT-LIB decision procedures (currently CVC Lite), and an automated reasoner within the Isabelle interactive theorem prover. This architecture shows that synthesized loop invariants can serve as a useful communication mechanism between different decision procedures. In addition, Bohne uses field constraint analysis, a combination mechanism that enables the use of uninterpreted function symbols within formulas of Monadic Second-Order Logic over trees. Using Bohne, we have verified operations on data structures such as linked lists with iterators and back pointers, trees with and without parent pointers, two-level skip lists, array data structures, and sorted lists. We have deployed Bohne in the Hob and Jahob data structure analysis systems, enabling us to combine Bohne with analyses of data structure clients and apply it in the context of larger programs. This paper describes the Bohne algorithm, the techniques that Bohne uses to reduce the amount of annotations and the running time of the analysis. We also describe the analysis architecture that enables Bohne to effectively take advantage of multiple reasoning procedures when proving complex invariants.

[51] Deciding Boolean Algebra with Presburger Arithmetic. Journal of Automated Reasoning. 2006. [ps]bib | http | list ]
We describe an algorithm for deciding the first-order multisorted theory BAPA, which combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables and cardinalities of unbounded finite sets, and supports arbitrary quantification over sets and integers. Our original motivation for BAPA is deciding verification conditions that arise in the static analysis of data structure consistency properties. Data structures often use an integer variable to keep track of the number of elements they store; an invariant of such a data structure is that the value of the integer variable is equal to the number of elements stored in the data structure. When the data structure content is represented by a set, the resulting constraints can be captured in BAPA. BAPA formulas with quantifier alternations arise when verifying programs with annotations containing quantifiers, or when proving simulation relation conditions for refinement and equivalence of program fragments. Furthermore, BAPA constraints can be used for proving the termination of programs that manipulate data structures, as well as in constraint database query evaluation and loop invariant inference. We give a formal description of an algorithm for deciding BAPA. We analyze our algorithm and show that it has optimal alternating time complexity, and that the complexity of BAPA matches the complexity of PA. Because it works by a reduction to PA, our algorithm yields the decidability of a combination of sets of uninterpreted elements with any decidable extension of PA. When restricted to BA formulas, the algorithm can be used to decide BA in optimal alternating time. Furthermore, the algorithm can eliminate individual quantifiers from a formula with free variables and therefore perform projection onto a desirable set of variables. We have implemented our algorithm and used it to discharge verification conditions in the Jahob system for data structure consistency checking of Java programs; our experience suggests that a straightforward implementation of the algorithm is effective on non-trivial formulas as long as the number of set variables is small. We also report on a new algorithm for solving the quantifier-free fragment of BAPA.

[50] An overview of the Jahob analysis system: Project Goals and Current Status. NSF Next Generation Software Workshop. 2006. [ps]bib | list ]
We present an overview of the Jahob system for modular analysis of data structure properties. Jahob uses a subset of Java as the implementation language and annotations with formulas in a subset of Isabelle as the specification language. It uses monadic second-order logic over trees to reason about reachability in linked data structures, the Isabelle theorem prover and Nelson-Oppen style theorem provers to reason about high-level properties and arrays, and a new technique to combine reasoning about constraints on uninterpreted function symbols with other decision procedures. It also incorporates new decision procedures for reasoning about sets with cardinality constraints. The system can infer loop invariants using new symbolic shape analysis. Initial results in the use of our system are promising; we are continuing to develop and evaluate it.

[49] Field Constraint Analysis. Proc. Int. Conf. Verification, Model Checking, and Abstract Interpretation. 2006. [ps]bib | list ]
We introduce field constraint analysis, a new technique for verifying data structure invariants. A field constraint for a field is a formula specifying a set of objects to which the field can point. Field constraints enable the application of decidable logics to data structures which were originally beyond the scope of these logics, by verifying the backbone of the data structure and then verifying constraints on fields that cross-cut the backbone in arbitrary ways. Previously, such cross-cutting fields could only be verified when they were uniquely determined by the backbone, which significantly limits the range of analyzable data structures. Field constraint analysis permits non-deterministic field constraints on cross-cutting fields, which allows the verification of invariants for data structures such as skip lists. Non-deterministic field constraints also enable the verification of invariants between data structures, yielding an expressive generalization of static type declarations. The generality of our field constraints requires new techniques. We present one such technique and prove its soundness. We have implemented this technique as part of a symbolic shape analysis deployed in the context of the Hob system for verifying data structure consistency. Using this implementation we were able to verify data structures that were previously beyond the reach of similar techniques.
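
As an illustration in the spirit of the skip-list example (the notation is ours), a non-deterministic field constraint ties a cross-cutting field nextSub to the list backbone next without determining it uniquely:

    \forall x\, y.\ \mathit{nextSub}(x) = y \;\rightarrow\; \mathit{next}^{+}(x, y)

Roughly, the backbone itself is checked in a decidable logic such as monadic second-order logic over trees, and occurrences of nextSub in verification conditions are eliminated using the constraint.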

[48] On Field Constraint Analysis. MIT CSAIL Technical Report MIT-CSAIL-TR-2005-072, MIT-LCS-TR-1010. 2005. [ps]bib | list ]
We introduce field constraint analysis, a new technique for verifying data structure invariants. A field constraint for a field is a formula specifying a set of objects to which the field can point. Field constraints enable the application of decidable logics to data structures which were originally beyond the scope of these logics, by verifying the backbone of the data structure and then verifying constraints on fields that cross-cut the backbone in arbitrary ways. Previously, such cross-cutting fields could only be verified when they were uniquely determined by the backbone, which significantly limited the range of analyzable data structures. Our field constraint analysis permits non-deterministic field constraints on cross-cutting fields, which makes it possible to verify invariants of data structures such as skip lists. Non-deterministic field constraints also enable the verification of invariants between data structures, yielding an expressive generalization of static type declarations. The generality of our field constraints requires new techniques, which are orthogonal to the traditional use of structure simulation. We present one such technique and prove its soundness. We have implemented this technique as part of a symbolic shape analysis deployed in the context of the Hob system for verifying data structure consistency. Using this implementation we were able to verify data structures that were previously beyond the reach of similar techniques.

[47] Implications of a Data Structure Consistency Checking System. International conference on Verified Software: Theories, Tools, Experiments (VSTTE, IFIP Working Group 2.3 Conference). 2005. [ps]bib | list ]
We present a framework for verifying that programs correctly preserve important data structure consistency properties. Results from our implemented system indicate that our system can effectively enable the scalable verification of very precise data structure consistency properties within complete programs. Our system treats both internal properties, which deal with a single data structure implementation, and external properties, which deal with properties that involve multiple data structures. A key aspect of our system is that it enables multiple analysis and verification packages to productively interoperate to analyze a single program. In particular, it supports the targeted use of very precise, unscalable analyses in the context of a larger analysis and verification system. The integration of different analyses in our system is based on a common set-based specification language: precise analyses verify that data structures conform to set specifications, whereas scalable analyses verify relationships between data structures and preconditions of data structure operations. There are several reasons why our system may be of interest in a broader program analysis and verification effort. First, it can ensure that the program satisfies important data structure consistency properties, which is an important goal in and of itself. Second, it can provide information that insulates other analysis and verification tools from having to deal directly with pointers and data structure implementations, thereby enabling these tools to focus on the key properties that they are designed to analyze. Finally, we expect other developers to be able to leverage its basic structuring concepts to enable the scalable verification of other program safety and correctness properties.

[46] Relational Analysis of Algebraic Datatypes. 10th European Soft. Eng. Conf. (ESEC) and 13th Symp. Foundations of Software Engineering (FSE). 2005. [ps]bib | list ]
We present a technique that enables the use of finite model finding to check the satisfiability of certain formulas whose intended models are infinite. Such formulas arise when using the language of sets and relations to reason about structured values such as algebraic datatypes. The key idea of our technique is to identify a natural syntactic class of formulas in relational logic for which reasoning about infinite structures can be reduced to reasoning about finite structures. As a result, when a formula belongs to this class, we can use existing finite model finding tools to check whether the formula holds in the desired infinite model.

[45] On Algorithms and Complexity for Sets with Cardinality Constraints. MIT CSAIL Technical Report 2005. [ps]bib | http | list ]
Typestate systems ensure many desirable properties of imperative programs, including initialization of object fields and correct use of stateful library interfaces. Abstract sets with cardinality constraints naturally generalize typestate properties: relationships between the typestates of objects can be expressed as subset and disjointness relations on sets, and elements of sets can be represented as sets of cardinality one. In addition, sets with cardinality constraints provide a natural language for specifying operations and invariants of data structures. Motivated by these program analysis applications, this paper presents new algorithms and new complexity results for constraints on sets and their cardinalities. We study several classes of constraints and demonstrate a trade-off between their expressive power and their complexity. Our first result concerns a quantifier-free fragment of Boolean Algebra with Presburger Arithmetic. We give a nondeterministic polynomial-time algorithm for reducing the satisfiability of sets with symbolic cardinalities to constraints on constant cardinalities, and give a polynomial-space algorithm for the resulting problem. The best previously existing algorithm runs in exponential space and nondeterministic exponential time. In a quest for more efficient fragments, we identify several subclasses of sets with cardinality constraints whose satisfiability is NP-hard. Finally, we identify a class of constraints that has polynomial-time satisfiability and entailment problems and can serve as a foundation for efficient program analysis. We give a system of rewriting rules for enforcing certain consistency properties of these constraints and show how to extract complete information from constraints in normal form. This result implies the soundness and completeness of our algorithms.

[44] An Algorithm for Deciding BAPA: Boolean Algebra with Presburger Arithmetic. 20th International Conference on Automated Deduction, CADE-20. 2005. [ps]bib | list ]
We describe an algorithm for deciding the first-order multisorted theory BAPA, which combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables and cardinalities of a priori unbounded finite sets, and supports arbitrary quantification over sets and integers. Our motivation for BAPA is deciding verification conditions that arise in the static analysis of data structure consistency properties. Data structures often use an integer variable to keep track of the number of elements they store; an invariant of such a data structure is that the value of the integer variable is equal to the number of elements stored in the data structure. When the data structure content is represented by a set, the resulting constraints can be captured in BAPA. BAPA formulas with quantifier alternations arise when verifying programs with annotations containing quantifiers, or when proving simulation relation conditions for refinement and equivalence of program fragments. Furthermore, BAPA constraints can be used for proving the termination of programs that manipulate data structures, and have applications in constraint databases. We give a formal description of a decision procedure for BAPA, which implies the decidability of BAPA. We analyze our algorithm and obtain an elementary upper bound on the running time, thereby giving the first complexity bound for BAPA. Because it works by a reduction to PA, our algorithm yields the decidability of a combination of sets of uninterpreted elements with any decidable extension of PA. Our algorithm can also be used to yield an optimal decision procedure for BA through a reduction to PA with bounded quantifiers. We have implemented our algorithm and used it to discharge verification conditions in the Jahob system for data structure consistency checking of Java programs; our experience with the algorithm is promising.

[43] On Relational Analysis of Algebraic Datatypes. MIT Technical Report 985. 2005. [ps]bib | list ]
We present a technique that enables the use of finite model finding to check the satisfiability of certain formulas whose intended models are infinite. Such formulas arise when using the language of sets and relations to reason about structured values such as algebraic datatypes. The key idea of our technique is to identify a natural syntactic class of formulas in relational logic for which reasoning about infinite structures can be reduced to reasoning about finite structures. As a result, when a formula belongs to this class, we can use existing finite model finding tools to check whether the formula holds in the desired infinite model.

[42] Hob: A Tool for Verifying Data Structure Consistency. 14th International Conference on Compiler Construction (tool demo). 2005. [ps]bib | list ]
This tool demonstration presents Hob, a system for verifying data structure consistency for programs written in a general-purpose programming language. Our tool enables the focused application of multiple communicating static analyses to different modules in the same program. Using our tool throughout the program development process, we have successfully identified several bugs in both specifications and implementations of programs.

[41] Cross-Cutting Techniques in Program Specification and Analysis. 4th International Conference on Aspect-Oriented Software Development. 2005. [ps]bib | list ]
We present three aspect-oriented constructs (formats, scopes, and defaults) that, in combination with a specification language based on abstract sets of objects, enable the modular application of multiple arbitrarily precise (and therefore arbitrarily unscalable) analyses to scalably verify data structure consistency properties in sizable programs. Formats use a form of field introduction to group together the declarations of all of the fields that together comprise a given data structure. Scopes and defaults enable the developer to state certain data structure consistency properties once in a single specification construct that cuts across the preconditions and postconditions of the procedures in the system. Standard approaches, in contrast, scatter and duplicate such properties across the preconditions and postconditions. We have implemented a prototype specification, analysis, and verification system based on these constructs and used this system to successfully verify a range of data structure consistency properties in several programs. Most previous research in the field of aspect-oriented programming has focused on the use of aspect-oriented concepts in design and implementation. Our experience indicates that aspect-oriented concepts can also be extremely useful for specification, analysis, and verification.

[40] Decision Procedures for Set-Valued Fields. Electr. Notes Theor. Comput. Sci.; Proc. Abstract Interpretation of Object-Oriented Languages. 2005. [ps]bib | list ]
A central feature of current object-oriented languages is the ability to dynamically instantiate user-defined container data structures such as lists, trees, and hash tables. Implementations of these data structures typically use references to dynamically allocated objects, which complicates reasoning about the resulting program. Reasoning is simplified if data structure operations are specified in terms of abstract sets of objects associated with each data structure. For example, an insertion into a data structure in this approach becomes simply an insertion into a dynamically changing set-valued field of an object, as opposed to a manipulation of a dynamically linked structure attached to the object. In this paper we explore reasoning techniques for programs that manipulate data structures specified using set-valued abstract fields associated with container objects. We compare the expressive power and the complexity of specification languages based on 1) decidable prefix vocabulary classes of first-order logic, 2) two-variable logic with counting, and 3) Nelson-Oppen combinations of multisorted theories. Such specification logics can be used for verification of object-oriented programs with supplied invariants. Moreover, by selecting an appropriate subset of properties expressible in such logic, the decision procedures for these logics enable automated computation of lattice operations in an abstract interpretation domain, as well as automated computation of abstract program semantics.
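
A minimal Scala sketch of this specification style (class and method names are our assumptions; the paper studies the specification logics rather than a particular tool): the client-visible contract of an insertion is phrased purely over an abstract set-valued field.

    // A container specified through an abstract set-valued field: clients
    // reason about `content`, never about the linked representation below it.
    object SetValuedFieldSketch {
      final class Container[A] {
        private var elems: List[A] = Nil     // hidden linked representation
        def content: Set[A] = elems.toSet    // abstract set-valued field
        def insert(x: A): Unit = {
          val before = content
          elems = x :: elems
          assert(content == before + x)      // effect: content' = content + {x}
        }
      }
    }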

[39] Generalized Typestate Checking for Data Structure Consistency. Verification, Model Checking and Abstract Interpretation. 2005. [ps]bib | list ]
We present an analysis to verify abstract set specifications for programs that use object field values to determine the membership of objects in abstract sets. In our approach, each module may encapsulate several data structures and use membership in abstract sets to characterize how objects participate in its data structures. Each module's specification uses set algebra formulas to characterize the effects of its operations on the abstract sets. The program may define abstract set membership in a variety of ways; arbitrary analyses (potentially with multiple analyses applied to different modules in the same program) may verify the corresponding set specifications. The analysis we present in this paper verifies set specifications by constructing and verifying set algebra formulas whose validity implies the validity of the set specifications. We have implemented our analysis and annotated several programs (75-2500 lines of code) with set specifications. We found that our original analysis algorithm did not scale; this paper describes several optimizations that improve the scalability of our analysis. It also presents experimental data comparing the original and optimized versions of our analysis.
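
For instance (a minimal example of ours), membership in an abstract set may be determined by a flag field, and an operation's specification is then a set-algebra formula over such sets:

    // Abstract set membership defined by a flag: an object belongs to the
    // abstract set `running` iff its `active` field is true; `start` then has
    // the set-algebra effect running' = running + {p}.
    object FlagSketch {
      final class Proc { var active: Boolean = false }
      def running(ps: List[Proc]): Set[Proc] = ps.filter(_.active).toSet
      def start(ps: List[Proc], p: Proc): Unit = {
        require(ps.contains(p))
        val before = running(ps)
        p.active = true
        assert(running(ps) == before + p)
      }
    }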

[38] File Refinement. The Archive of Formal Proofs. 2004. [ bib | http | list ]
[37] Binary Search Trees. The Archive of Formal Proofs. 2004. [ bib | http | list ]
[36] Modular Pluggable Analyses for Data Structure Consistency. Monterey Workshop on Software Engineering Tools: Compatibility and Integration. 2004. [ bib | list ]
[35] On Decision Procedures for Set-Valued Fields. MIT CSAIL Technical Report 975. 2004. [ps]bib | list ]
An important feature of object-oriented programming languages is the ability to dynamically instantiate user-defined container data structures such as lists, trees, and hash tables. Programs implement such data structures using references to dynamically allocated objects, which allows data structures to store unbounded numbers of objects, but makes reasoning about programs more difficult. Reasoning about object-oriented programs with complex data structures is simplified if data structure operations are specified in terms of abstract sets of objects associated with each data structure. For example, an insertion into a data structure in this approach becomes simply an insertion into a dynamically changing set-valued field of an object, as opposed to a manipulation of a dynamically linked structure attached to the object. In this paper we explore reasoning techniques for programs that manipulate data structures specified using set-valued abstract fields associated with container objects. We compare the expressive power and the complexity of specification languages based on 1) decidable prefix vocabulary classes of first-order logic, 2) two-variable logic with counting, and 3) Nelson-Oppen combinations of multisorted theories. Such specification logics can be used for verification of object-oriented programs with supplied invariants. Moreover, by selecting an appropriate subset of properties expressible in such logic, the decision procedures for these logics yield automated computation of lattice operations in an abstract interpretation domain, as well as automated computation of abstract program semantics.

[34] Combining Theorem Proving with Static Analysis for Data Structure Consistency. International Workshop on Software Verification and Validation. 2004. [ps]bib | list ]
We describe an approach for combining theorem proving techniques with static analysis to analyze data structure consistency for programs that manipulate heterogeneous data structures. Our system uses interactive theorem proving and shape analysis to verify that data structure implementations conform to set interfaces. A simpler static analysis then uses the verified set interfaces to verify properties that characterize how shared objects participate in multiple data structures. We have successfully applied this technique to several programs and found that theorem proving within circumscribed regions of the program combined with static analysis enables the verification of large-scale program properties.

[33] On Spatial Conjunction as Second-Order Logic. MIT CSAIL Technical Report 970. 2004. [ ps | bib | http | list ]
Spatial conjunction is a powerful construct for reasoning about dynamically allocated data structures, as well as concurrent, distributed and mobile computation. While researchers have identified many uses of spatial conjunction, its precise expressive power compared to traditional logical constructs was not previously known. In this paper we establish the expressive power of spatial conjunction. We construct an embedding from first-order logic with spatial conjunction into second-order logic, and, more surprisingly, an embedding from full second-order logic into first-order logic with spatial conjunction. These embeddings show that the satisfiability of formulas in first-order logic with spatial conjunction is equivalent to the satisfiability of formulas in second-order logic. These results explain the great expressive power of spatial conjunction and can be used to show that adding unrestricted spatial conjunction to a decidable logic leads to an undecidable logic. As one example, we show that adding unrestricted spatial conjunction to two-variable logic leads to undecidability. On the side of decidability, the embedding into second-order logic immediately implies the decidability of first-order logic with a form of spatial conjunction over trees. The embedding into first-order logic with spatial conjunction also has useful consequences: because a restricted form of spatial conjunction in two-variable logic preserves decidability, we obtain that a correspondingly restricted form of second-order quantification in two-variable logic is decidable. The resulting language generalizes the first-order theory of boolean algebra over sets and is useful in reasoning about the contents of data structures in object-oriented languages.
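
For reference, spatial conjunction splits the heap among its conjuncts; in the standard separation logic semantics,

    $h \models \phi_1 * \phi_2 \iff \exists h_1\, h_2.\; h = h_1 \uplus h_2 \,\wedge\, h_1 \models \phi_1 \,\wedge\, h_2 \models \phi_2$

where $\uplus$ denotes union of heaps with disjoint domains; intuitively, the implicit quantification over heap splittings is what gives the construct its second-order expressive power.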

[32] On Our Experience with Modular Pluggable Analyses. MIT CSAIL Technical Report 965. 2004. [ ps | bib | list ]
We present a technique that enables the focused application of multiple analyses to different modules in the same program. Our research has two goals: 1) to address the scalability limitations of precise analyses by focusing the analysis on only those parts of the program that are relevant to the properties that the analysis is designed to verify, and 2) to enable the application of specialized analyses that verify properties of specific classes of data structures to programs that simultaneously manipulate several different kinds of data structures. In our approach, each module encapsulates a data structure and uses membership in abstract sets to characterize how objects participate in its data structure. Each analysis verifies that the implementation of the module 1) preserves important internal data structure representation invariants and 2) conforms to a specification that uses formulas in a set algebra to characterize the effects of operations on the data structure. The analyses use the common set abstraction to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses. We characterize the key soundness property that an analysis plugin must satisfy to successfully participate in our system and present several analysis plugins that satisfy this property: a flag plugin that analyzes modules in which abstract set membership is determined by a flag field in each object, and a graph types plugin that analyzes modules in which abstract set membership is determined by reachability properties of objects stored in tree-like data structures.
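
A minimal Scala sketch of the flag plugin idea, with hypothetical names: the value of a status field determines each object's membership in two abstract sets, so verifying set algebra effects reduces to reasoning about field updates.

    // Illustrative only: membership of a Process in the abstract sets
    // Running and Sleeping is defined by the value of its status field.
    final class Process(var status: Int)

    object Scheduler {
      final val RUNNING = 0  // abstract set Running  = { p | p.status == RUNNING }
      final val SLEEPING = 1 // abstract set Sleeping = { p | p.status == SLEEPING }

      // effect: Running' = Running + {p}, Sleeping' = Sleeping - {p}
      def wakeUp(p: Process): Unit = { p.status = RUNNING }

      // effect: Running' = Running - {p}, Sleeping' = Sleeping + {p}
      def suspend(p: Process): Unit = { p.status = SLEEPING }
    }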

[31] The First-Order Theory of Sets with Cardinality Constraints is Decidable. MIT CSAIL Technical Report 958. 2004. [ ps | bib | http | list ]
We show the decidability of the first-order theory of the language that combines Boolean algebras of sets of uninterpreted elements with Presburger arithmetic operations. We thereby disprove a recent conjecture that this theory is undecidable. Our language allows relating the cardinalities of sets to the values of integer variables, and can distinguish finite and infinite sets. We use quantifier elimination to show the decidability and obtain an elementary upper bound on the complexity. Precise program analyses can use our decidability result to verify representation invariants of data structures that use an integer field to represent the number of stored elements.
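
For example, the language can express valid sentences such as the following cardinality identity for finite sets (a representative formula, not one taken from the paper):

    $\forall A.\, \forall B.\;\; \mathit{finite}(A) \wedge \mathit{finite}(B) \;\rightarrow\; |A \cup B| + |A \cap B| = |A| + |B|$

relating the Boolean algebra structure of sets to Presburger arithmetic over their cardinalities.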

[30] Verifying a File System Implementation. Sixth International Conference on Formal Engineering Methods. 2004. [ ps | bib | list ]
We present a correctness proof for a basic file system implementation. This implementation contains key elements of standard Unix file systems such as inodes and fixed-size disk blocks. We prove the implementation correct by establishing a simulation relation between the specification of the file system (which models the file system as an abstract map from file names to sequences of bytes) and its implementation (which uses fixed-size disk blocks to store the contents of the files). We used the Athena proof system to represent and validate our proof. Our experience indicates that Athena's use of block-structured natural deduction, support for structural induction and proof abstraction, and seamless integration with high-performance automated theorem provers were essential to our ability to successfully manage a proof of this size.
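
A minimal Scala sketch of the setup (our rendering, not the paper's Athena development): the simulation relation is induced by an abstraction function from the block-level state to the abstract map from file names to byte sequences.

    // Hypothetical types standing in for the implementation state.
    final case class Disk(blocks: Vector[Vector[Byte]])
    final case class Inode(name: String, blockIds: List[Int], size: Int)

    // Abstraction function: a file's contents are its blocks concatenated
    // and truncated to the file's logical size.
    def abstractState(disk: Disk, inodes: List[Inode]): Map[String, Vector[Byte]] =
      inodes.map { ino =>
        val bytes = ino.blockIds.flatMap(disk.blocks).toVector.take(ino.size)
        ino.name -> bytes
      }.toMap

Correctness then amounts to showing that running an implementation operation and abstracting the result agrees with applying the corresponding specification operation to the abstracted state.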

[29] Generalized Records and Spatial Conjunction in Role Logic. International Static Analysis Symposium. 2004. [ ps | bib | list ]
We have previously introduced role logic as a notation for describing properties of relational structures in shape analysis, databases and knowledge bases. A natural fragment of role logic corresponds to two-variable logic with counting and is therefore decidable. We show how to use role logic to describe open and closed records, as well as the dual of records, inverse records. We observe that the spatial conjunction operation of separation logic naturally models record concatenation. Moreover, we show how to eliminate the spatial conjunction of formulas of quantifier depth one in first-order logic with counting. As a result, allowing spatial conjunction of formulas of quantifier depth one preserves the decidability of two-variable logic with counting. This result applies to the two-variable role logic fragment as well. The resulting logic smoothly integrates type system and predicate calculus notation and can be viewed as a natural generalization of the notation for constraints arising in role analysis and similar shape analysis approaches.

[28] On Generalized Records and Spatial Conjunction in Role Logic. MIT CSAIL Technical Report 942. 2004. [ ps | bib | http | list ]
We have previously introduced role logic as a notation for describing properties of relational structures in shape analysis, databases and knowledge bases. A natural fragment of role logic corresponds to two-variable logic with counting and is therefore decidable. We show how to use role logic to describe open and closed records, as well as the dual of records, inverse records. We observe that the spatial conjunction operation of separation logic naturally models record concatenation. Moreover, we show how to eliminate the spatial conjunction of formulas of quantifier depth one in first-order logic with counting. As a result, allowing spatial conjunction of formulas of quantifier depth one preserves the decidability of two-variable logic with counting. This result applies to the two-variable role logic fragment as well. The resulting logic smoothly integrates type system and predicate calculus notation and can be viewed as a natural generalization of the notation for constraints arising in role analysis and similar shape analysis approaches.

[27] On Verifying a File System Implementation. MIT CSAIL Technical Report 946. 2004. [ ps | bib | list ]
We present a correctness proof for a basic file system implementation. This implementation contains key elements of standard Unix file systems such as inodes and fixed-size disk blocks. We prove the implementation correct by establishing a simulation relation between the specification of the file system (which models the file system as an abstract map from file names to sequences of bytes) and its implementation (which uses fixed-size disk blocks to store the contents of the files). We used the Athena proof checker to represent and validate our proof. Our experience indicates that Athena's use of block-structured natural deduction, support for structural induction and proof abstraction, and seamless connection with high-performance automated theorem provers were essential to our ability to successfully manage a proof of this size.

[26] Boolean Algebra of Shape Analysis Constraints. Verification, Model Checking and Abstract Interpretation. 2004. [ ps | bib | list ]
The parametric shape analysis framework of Sagiv, Reps, and Wilhelm [45, 46] uses three-valued structures as dataflow lattice elements to represent sets of states at different program points. The recent work of Yorsh, Reps, Sagiv, and Wilhelm [48, 50] introduces a family of formulas in (classical, two-valued) logic that are isomorphic to three-valued structures [46] and represent the same sets of concrete states. In this paper we introduce a larger syntactic class of formulas that has the same expressive power as the formulas in [48]. The formulas in [48] can be viewed as a normal form of the formulas in our syntactic class; we give an algorithm for transforming our formulas to this normal form. Our formulas make it obvious that the constraints are closed under all boolean operations and therefore form a boolean algebra. Our algorithm also gives a reduction of the entailment and the equivalence problems for these constraints to the satisfiability problem.

[25] On computing the fixpoint of a set of boolean equations. Microsoft Research Technical Report MSR-TR-2003-08. 2003. [ ps | bib | http | list ]
This paper presents a method for computing the least fixpoint of a system of equations over booleans. The resulting computation can be significantly shorter than the result of iteratively evaluating the entire system until a fixpoint is reached.
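
A minimal sketch of the general idea (a standard worklist formulation under our own assumptions, not necessarily the paper's algorithm): for monotone equations x(i) = rhs(i)(x) over booleans, start from all-false and re-evaluate only the equations whose inputs changed.

    import scala.collection.mutable

    // Worklist computation of the least fixpoint of a system of monotone
    // boolean equations; deps(i) lists the variables that equation i reads.
    def leastFixpoint(
        n: Int,
        rhs: Int => (Int => Boolean) => Boolean,
        deps: Int => Seq[Int]
    ): Array[Boolean] = {
      val value = Array.fill(n)(false)           // start at bottom (all false)
      val users = Array.fill(n)(List.empty[Int]) // reverse dependency edges
      for (i <- 0 until n; d <- deps(i)) users(d) ::= i
      val worklist = mutable.Queue.empty[Int]
      worklist ++= (0 until n)
      while (worklist.nonEmpty) {
        val i = worklist.dequeue()
        if (!value(i) && rhs(i)(j => value(j))) { // monotonicity: flips false -> true only
          value(i) = true
          worklist ++= users(i)                   // re-examine affected equations
        }
      }
      value
    }

Because only the affected equations are re-examined, the computation can be much shorter than repeatedly re-evaluating the entire system.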

[24] On Modular Pluggable Analyses Using Set Interfaces. MIT CSAIL Technical Report 933. 2003. [ ps | bib | list ]
We present a technique that enables the focused application of multiple analyses to different modules in the same program. Our research has two goals: 1) to address the scalability limitations of precise analyses by focusing the analysis on only those parts of the program that are relevant to the properties that the analysis is designed to verify, and 2) to enable the application of specialized analyses that verify properties of specific classes of data structures to programs that simultaneously manipulate several different kinds of data structures. In our approach, each module encapsulates a data structure and uses membership in abstract sets to characterize how objects participate in its data structure. Each analysis verifies that the implementation of the module 1) preserves important internal data structure representation invariants and 2) conforms to a specification that uses formulas in a set algebra to characterize the effects of operations on the data structure. The analyses use the common set abstraction to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses. We characterize the key soundness property that an analysis plugin must satisfy to successfully participate in our system and present several analysis plugins that satisfy this property: a flag plugin that analyzes modules in which abstract set membership is determined by a flag field in each object, and a graph types plugin that analyzes modules in which abstract set membership is determined by reachability properties of objects stored in tree-like data structures.

[23] Generalized Typestate Checking Using Set Interfaces and Pluggable Analyses. SIGPLAN Notices. 2004. [ ps | bib | list ]
We present a generalization of standard typestate systems in which the typestate of each object is determined by its membership in a collection of abstract typestate sets. This generalization supports typestates that model participation in abstract data types, composite typestates that correspond to membership in multiple sets, and hierarchical typestates. Because membership in typestate sets corresponds directly to participation in data structures, our typestate system characterizes global sharing patterns. In our approach, each module encapsulates a data structure and uses membership in abstract sets to characterize how objects participate in its data structure. Each analysis verifies that the implementation of the module 1) preserves important internal data structure representation invariants and 2) conforms to a specification that uses formulas in a set algebra to characterize the effects of operations on the data structure. The analyses use the common set abstraction to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses.

[22] On Role Logic. MIT CSAIL Technical Report 925. 2003. [ ps | bib | http | list ]
We present role logic, a notation for describing properties of relational structures in shape analysis, databases, and knowledge bases. We construct role logic using the ideas of de Bruijn's notation for lambda calculus, an encoding of first-order logic in lambda calculus, and a simple rule for implicit arguments of unary and binary predicates. The unrestricted version of role logic has the expressive power of first-order logic with transitive closure. Using a syntactic restriction on role logic formulas, we identify a natural fragment RL2 of role logic. We show that the RL2 fragment has the same expressive power as two-variable logic with counting C2, and is therefore decidable. We present a translation of an imperative language into the decidable fragment RL2, which allows compositional verification of programs that manipulate relational structures. In addition, we show how RL2 encodes boolean shape analysis constraints and an expressive description logic.

[21] On the Boolean Algebra of Shape Analysis Constraints. MIT CSAIL Technical Report. 2003. [ ps | bib | list ]
Shape analysis is a promising technique for statically verifying and extracting properties of programs that manipulate complex data structures. We introduce a new characterization of constraints that arise in parametric shape analysis based on manipulation of three-valued structures as dataflow facts. We identify an interesting syntactic class of first-order logic formulas that captures the meaning of three-valued structures under concretization. This class is broader than previously introduced classes, allowing for a greater flexibility in the formulation of shape analysis constraints in program annotations and internal analysis representations. Three-valued structures can be viewed as one possible normal form of the formulas in our class. Moreover, we characterize the meaning of three-valued structures under “tight concretization”. We show that the seemingly minor change from concretization to tight concretization increases the expressive power of three-valued structures in such a way that the resulting constraints are closed under all boolean operations. We call the resulting constraints boolean shape analysis constraints. The main technical contribution of this paper is a natural syntactic characterization of boolean shape analysis constraints as arbitrary boolean combinations of first-order sentences of a certain form, and an algorithm for transforming such boolean combinations into the normal form that corresponds directly to three-valued structures. Our result holds in the presence of arbitrary shape analysis instrumentation predicates. The result enables the reduction (without any approximation) of the entailment and the equivalence of shape analysis constraints to the satisfiability of shape analysis constraints. When the satisfiability of the constraints is decidable, our result implies that the entailment and the equivalence of the constraints are also decidable, which enables the use of constraints in a compositional shape analysis with a predictable behavior.

[20] Structural Subtyping of Non-Recursive Types is Decidable. Eighteenth Annual IEEE Symposium on Logic in Computer Science (LICS). 2003. [ ps | bib | list ]
We show that the first-order theory of structural subtyping of non-recursive types is decidable, as a consequence of a more general result on the decidability of term powers of decidable theories. Let Σ be a language consisting of function symbols and let C (with a finite or infinite domain C) be an L-structure where L is a language consisting of relation symbols. We introduce the notion of Σ-term-power of the structure C, denoted PΣ(C). The domain of PΣ(C) is the set of Σ-terms over the set C. PΣ(C) has one term algebra operation for each f in Σ, and one relation for each r in L, defined by lifting operations of C to terms over C. We extend quantifier elimination for term algebras and apply the Feferman-Vaught technique for quantifier elimination in products to obtain the following result. Let K be a family of L-structures and KP the family of their Σ-term-powers. Then the validity of any closed formula F on KP can be effectively reduced to the validity of some closed formula q(F) on K. Our result implies the decidability of the first-order theory of structural subtyping of non-recursive types with covariant constructors, and the construction generalizes to contravariant constructors as well.
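
Concretely, for covariant constructors the lifted ordering on terms is defined pointwise, which is the standard structural subtyping rule:

    $f(s_1, \ldots, s_n) \le f(t_1, \ldots, t_n) \;\iff\; s_1 \le t_1 \wedge \cdots \wedge s_n \le t_n$

with the base case given by the ordering of C on primitive types, and terms built from distinct head constructors incomparable.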

[19] Existential Heap Abstraction Entailment is Undecidable. 10th Annual International Static Analysis Symposium. 2003. [ ps | bib | list ]
In this paper we study constraints for specifying properties of data structures consisting of linked objects allocated in the heap. Motivated by heap summary graphs in role analysis and shape analysis, we introduce the notion of regular graph constraints. A regular graph constraint is a graph representing the heap summary; a heap satisfies a constraint if and only if the heap can be homomorphically mapped to the summary. Regular graph constraints form a very simple and natural fragment of the existential monadic second-order logic over graphs. One of the key problems in a compositional static analysis is proving that procedure preconditions are satisfied at every call site. For role analysis, precondition checking requires determining the validity of implication, i.e., entailment of regular graph constraints. The central result of this paper is the undecidability of regular graph constraint entailment. The undecidability of the entailment problem is surprising because of the simplicity of regular graph constraints: in particular, the satisfiability of regular graph constraints is decidable. Our undecidability result implies that there is no complete algorithm for statically checking procedure preconditions or postconditions, simplifying static analysis results, or checking that given analysis results are correct. While incomplete conservative algorithms for regular graph constraint entailment checking are possible, we argue that heap specification languages should avoid second-order existential quantification in favor of explicitly specifying a criterion for summarizing objects.
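
In symbols, with H a heap, G a summary graph, and V_H, V_G their node sets, the satisfaction relation described above is

    $H \models G \;\iff\; \exists\, h : V_H \to V_G \text{ mapping each labeled edge of } H \text{ to an equally labeled edge of } G$

and entailment asks whether every heap satisfying G_1 also satisfies G_2; it is this implication that the paper proves undecidable.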

[18] In-Place Refinement for Effect Checking. Workshop on Automated Verification of Infinite-State Systems. 2003. [ ps | bib | list ]
The refinement calculus is a powerful framework for reasoning about programs, specifications, and refinement relations between programs and specifications. In this paper we introduce a new refinement calculus construct, in-place refinement. We use in-place refinement to prove the correctness of a technique for checking the refinement relation between programs and specifications. The technique is applicable whenever the specification is an idempotent predicate transformer, as is the case for most procedure effects. In-place refinement is a predicate on the current program state. A command in-place refines a specification in a given state if the effect of every execution of the command in the state is no worse than the effect of some execution of the specification in the state. We demonstrate the usefulness of the in-place refinement construct by showing the correctness of a precise technique for checking effects of commands in a computer program. The technique is precise because it takes into account the set of possible states in which each command can execute, using the information about the control-flow and expressions of conditional commands. This precision is particularly important for handling aliasing in object-oriented programs that manipulate dynamically allocated data structures. We have implemented the technique as a part of a side-effect checker for the programming language C#.

[17] On the Theory of Structural Subtyping. MIT LCS Technical Report 879. 2003. [ ps | bib | http | list ]
We show that the first-order theory of structural subtyping of non-recursive types is decidable. Let Σ be a language consisting of function symbols (representing type constructors) and C a decidable structure in the relational language L containing a binary relation <=. C represents primitive types; <= represents a subtype ordering. We introduce the notion of Σ-term-power of C, which generalizes the structure arising in structural subtyping. The domain of the Σ-term-power of C is the set of Σ-terms over the set of elements of C. We show that the decidability of the first-order theory of C implies the decidability of the first-order theory of the Σ-term-power of C. Our decision procedure makes use of quantifier elimination for term algebras and the Feferman-Vaught theorem. Our result implies the decidability of the first-order theory of structural subtyping of non-recursive types.

[16] Typestate Checking and Regular Graph Constraints. MIT LCS Technical Report 863. 2002. [ ps | bib | http | list ]
We introduce regular graph constraints and explore their decidability properties. The motivation for regular graph constraints is 1) type checking of changing types of objects in the presence of linked data structures, 2) shape analysis techniques, and 3) generalization of similar constraints over trees and grids. We define a subclass of graphs called heaps as an abstraction of the data structures that a program constructs during its execution. We prove that determining the validity of implication for regular graph constraints over the class of heaps is undecidable. We show undecidability by exhibiting a characterization of certain corresponder graphs in terms of presence and absence of homomorphisms to a finite number of fixed graphs. The undecidability of implication of regular graph constraints implies that there is no algorithm that will verify that procedure preconditions are met or that the invariants are maintained when these properties are expressed in any specification language at least as expressive as regular graph constraints.

[15] Role Analysis. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2002. [ ps | bib | list ]
We present a new role system in which the type (or role) of each object depends on its referencing relationships with other objects, with the role changing as these relationships change. Roles capture important object and data structure properties and provide useful information about how the actions of the program interact with these properties. Our role system enables the programmer to specify the legal aliasing relationships that define the set of roles that objects may play, the roles of procedure parameters and object fields, and the role changes that procedures perform while manipulating objects. We present an interprocedural, compositional, and context-sensitive role analysis algorithm that verifies that a program maintains role constraints.
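
As a hedged illustration (hypothetical code and names, not the paper's role declaration syntax): a node's role is determined by which structure references it, so moving a node between two lists is a role change.

    // Illustrative only: the role of a Node (say, member of `from` versus
    // member of `to`) is determined by its referencing relationships and
    // changes when those relationships change.
    final class Node(var next: Node)
    final class ProcList(var head: Node)

    def transferHead(from: ProcList, to: ProcList): Unit = {
      val n = from.head
      if (n != null) {
        from.head = n.next // n is no longer referenced by `from`
        n.next = to.head
        to.head = n        // n is now referenced by `to`: its role has changed
      }
    }

A role analysis in the sense of the paper would check that the intermediate states inside transferHead, where n is referenced by neither list, are permitted by the declared role constraints.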

[14] Designing an Algorithm for Role Analysis. Master's Thesis, MIT LCS. 2001. [ ps | bib | list ]
This thesis presents a system for specifying constraints on dynamically changing referencing relationships of heap objects, and an analysis for static verification of these constraints. The constraint specification system is based on the concept of role. The role of an object depends, in large part, on its aliasing relationships with other objects, with the role of each object changing as its aliasing relationships change. In this way roles capture object and data structure properties such as unique references, membership of objects in data structures, disjointness of data structures, absence of representation exposure, bidirectional associations, treeness, and absence or presence of cycles in the heap. Roles generalize linear types by allowing multiple aliases of heap objects that participate in recursive data structures. Unlike graph grammars and graph types, roles contain sufficiently general constraints to conservatively approximate any data structure. We give a semantics for mutually recursive role definitions and derive properties of roles as an invariant specification language. We introduce a programming model that allows temporary violations of role constraints. We describe a static role analysis for verifying that a program conforms to the programming model. The analysis uses fixpoint computation to synthesize loop invariants in each procedure. We introduce a procedure interface specification language and its semantics. We present an interprocedural, compositional, and context-sensitive role analysis that verifies that a program respects the role constraints across procedure calls.

[13] Roles are really great!. MIT LCS Technical Report 822. 2001. [ ps | bib | list ]
We present a new role system for specifying changing referencing relationships of heap objects. The role of an object depends, in large part, on its aliasing relationships with other objects, with the role of each object changing as its aliasing relationships change. Roles therefore capture important object and data structure properties and provide useful information about how the actions of the program interact with these properties. Our role system enables the programmer to specify the legal aliasing relationships that define the set of roles that objects may play, the roles of procedure parameters and object fields, and the role changes that procedures perform while manipulating objects. We present an interprocedural, compositional, and context-sensitive role analysis algorithm that verifies that a program respects the role constraints.

[12] A Language for Role Specifications. Workshop on Languages and Compilers for Parallel Computing. 2001. [ ps | bib | list ]
This paper presents a new language for identifying the changing roles that objects play over the course of the computation. Each object's points-to relationships with other objects determine the role that it currently plays. Roles therefore reflect the object's membership in specific data structures, with the object's role changing as it moves between data structures. We provide a programming model which allows the developer to specify the roles of objects at different points in the computation. The model also allows the developer to specify the effect of each operation at the granularity of role changes that occur in identified regions of the heap.

[11] Object Models, Heaps, and Interpretations. MIT LCS Technical Report 816. 2001. [ ps | bib | list ]
This paper explores the use of object models for specifying verifiable heap invariants. We define a simple language based on sets and relations and illustrate its use through examples. We give a formal semantics of the language by translation into predicate calculus and interpretation of predicates in terms of objects and references in the program heap.

[10] Numerical Representations as Purely Functional Data Structures: A New Approach. INFORMATICA, Institute of Mathematics and Informatics, Vilnius. 2002. [ bib | list ]
[9] Types and confluence in lambda calculus. 3rd Panhellenic Logic Symposium. 2001. [ bib | list ]
[8] Confluence of untyped lambda calculus via simple types. Proceedings of the 7th Italian Conference on Theoretical Computer Science, ICTCS 2001. 2001. [ bib | list ]
[7] Reducibility method for termination properties of typed lambda terms. Fifth International Workshop on Termination. 2001. [ bib | list ]
[6] Modular Language Specifications in Haskell. Theoretical Aspects of Computer Science with practical application. 2000. [ ps | bib | list ]
We propose a framework for specification of programming language semantics, abstract and concrete syntax, and lexical structure. The framework is based on Modular Monadic Semantics and allows independent specification of various language features. Features such as arithmetic, conditionals, exceptions, state, and nondeterminism can be freely combined into working interpreters, facilitating experiments in language design. A prototype implementation of this system in Haskell is described, and possibilities for a more sophisticated interpreter generator are outlined.

[5] Modular Interpreters in Haskell. B.Sc. Thesis, University of Novi Sad. 2000. [ ps | bib | list ]
[4] Numerical Representations as Purely Functional Data Structures. XIV Conference on Applied Mathematics PRIM, Palić. 2000. [ bib | list ]
[3] Reducibility Method in Simply Typed Lambda Calculus. XIV Conference on Applied Mathematics PRIM, Palić. 2000. [ bib | list ]
[2] Developing a Multigrid Solver for Standing Wave Equation. Proceedings of the 28th Dr. Bessie F. Lawrence International Summer Science Institute Participants, Weizmann Institute of Science. 1996. [ ps | bib | list ]
In this paper, the multigrid technique is adapted to solve the standing one-dimensional wave equation with radiation boundary conditions. The solver, consisting of a wave cycle and a ray cycle, uses Gauss-Seidel and Kaczmarz relaxation sweeps and is designed to work efficiently for all error components.

[1] PLS: Programming Language for Simulations. Proceedings of the Petnica Science Center Seminar '93. 1993. [ bib | list ]
I describe the design and implementation of PLS, a programming language for writing simulations of physical processes. PLS supports descriptions of dynamically changing components of systems and simplifies the specification of interactions between multiple objects. The language is implemented as a translator to Pascal. I give the EBNF syntax of the language and the translation rules from PLS to Pascal, and present an example of a PLS program simulating n-body gravitational interaction.