FunCheck: The Cure to Everything(TM)

Repository

http://code.google.com/p/scalapatterns/source/browse/#svn/matcheck/trunk

http://scalapatterns.googlecode.com/svn/matcheck/

New SVN repository for the future compiler plugin:

https://polysvn.ch/repos/funcheck/

Practical Connection to Scala Compiler Implementation

Extension practical enough that it can soon be incorporated into Scala release.

Pure Scala

Now on its own page: Pure Scala.

Contracts

First order contracts for now. If there are no unreasonable subtleties, make them work for higher-order case as well.

Convince LAMP to add proper notation for contracts to Scala.

in fact, they can be compiled to .NET directly, which now supports them

Universally quantified assertions as generalizations of contracts

Online static analysis and testing for universally quantified assertions

the role of partial evaluation, efficient compilation of checks as opposed to just true/false/unknown!

Weakening of universally quantified assertions to runtime values

using run-time evaluation on these values to guide static proofs after program runs

Introduce notion of explicit error condition and ordering on programs (worse program has more errors), as in pure weakest preconditions. It is sufficient to verify the worse program. Consider the following transformations:

turning implication of postcondition into explicit precondition: this causes more failures, makes program “worse”, the difference is in the case where the precondition does not hold
turning contract with only postcondition into univerally quantified assertion: this makes program worse, because we require it to hold not only for the arguments on which function is actually called, but for all values
adding universally quantified assertions over enumerable data types (integers, lists, trees) does not change recursive enumerability of the verification problem (run all assertions in parallel)
it makes sense to replace the universally quantified statements with quantification only over values occurring within the program
- in function contract checking, it is not even over all arguments ocurring, but only over those on which functions are called
- this notion generalizes to arbitrary assertions if we have the right runtime data structures that keep track of n-tuples of calls to functions; these data structures seem related to memoization and multiple dispatch
- we could retain the completeness of universal quantifiers if instead of restricting to the values encountered, we merely prioritize towards these values in some fair maner; verification thus becomes test prioritization
- testing over values encountered can be used to guide the proof and at some point discharge infinite many tests at once

Invariants

Class invariants.

For now, only invariants that mention pure state:

checked at construction time
assumed at pattern matching time

I would suggest introducing specially named method 'inv' to denote them.

ScalaCheck Generators

ScalaCheck Generators describes how we will be automatically inject ScalaCheck-compliant generators into the program.

We can then describe global invariants properties using a handy and easy to understand syntax, declared in the Scala language itself. This wil de facto replace the need of having a special syntax for invariants.

Specification Variables

They look exactly like ordinary variables, but are removed at run-time (except perhaps for debugging purposes).

Perhaps the indication that they are specification variables and should be removed can be indicated by standard Scala annotation.

They can appear in invariants and can be used to eliminate induction and quantifiers from verification conditions. See Viktor's PLDI'08 and PLDI'09 papers, as well as EMIC experience in Hypervisor.

Places where we would like them:

fields (most important)
parameters (useful)
local variables (useful)

A specification variable analysis detects whether variables affect the control flow (perhaps modulo termination). In general we should be able to write

specVar = pureFunctionInvocation(x,y,z)

that is, we can have complex updates to specification variables given by pure code.

Taint analysis for specification variables and checking the soundness conditions.

Vienna Development Method (VDM):

VDM Book

Abstract Classes, Specification Vars, and Algebraic Specs

If I think about the right place for specification variables, they should live in something like abstract classes (or traits?).

I see two solutions.

A simple one is where the abstract class comes with axioms that the operations should satisfy. When we implement a trait, the system attempts to prove that the axioms hold. This is very elegant especially for mathematical structures and Haskell-style type classes.

The second solution, which will probably be what we will do for mutable data structures, is that an abstract class comes with a specification variable, as well as a set of contracts that on operations, expressed in terms of these specification variables. Then these contracts need to be respected in the subclass.

Advantages of Algebraic Specs

Properties relating multiple functions:

specify extractors, achieve abstraction: export names of abstraction functions and the properties
express e.g. properties of the equality symbol
- automation of obligations for proper definition of congruences (user-defined equality)

Precise Verification Conditions

To report true errors for unhandled patterns, we need to be able to find true counterexamples.

Can we create verification conditions with the property that, if there exists a counterexample to the given specification, then the given formula has a counterexample? For safety properties (and taking into account boundedness of integers), this should be possible.

The reason is that counterexamples to such properties can be revealed by bounded-exhaustive testing: if there exists a counterexample, then there exists an execution that starts from some finite state and takes some path, evaluating the intermediate assertions to true and evaluating the final assertion to false. Therefore, there should be an existential formula whose satisfying assignment descrbies the error trace.

However, if procedure summaries are not precise, then the counterexample we find need not be real.

Approaches:

use Kodkod and EBU formulas
prove that finite instantiation is sufficient and use z3 on quantifier-free formulas
gradual grounding: gradually explore (as in systematic testing) possible values of algebraic data types, ensuring that the values of recursive function calls can be computed. Direct unrolling of functions. Use a combination of z3 w/ push/pop and ground evaluation where needed.

Non-Contract Specifications

It is important to realize that the specifications in the original Matcheck project went beyond pre-post conditions, because of the support for extractors. Namely, extractor specifications were relating multiple functions, in the style of algebraic specifications as in the

Larch Verification System

This is one novelty of this work compared to other recent work in verification, which could not naturally handle relationships between different methods.

For matcheck-style specifications, it makes a lot of sense to automatically generate them.

Another class of specifications are properties of equals methods:

reflexivity
symmetry
transitivity
congruence with respect to operations, e.g.

forall ((t1:Tree,t2:Tree,x:E) => (!equals(t1,t2) || equals(add(t1,x), add(t2,x))))

Modeling Recusion

If we can syntactically show it terminates, we generate appropriate equational axioms.

Otherwise, we use partial functions. Two possible models:

use option types - looks like we will use it a lot anyways
use relations - useful if we extend this to state

Proving Formulas: Completeness and Efficiency

Counterexample search is essential, but it should work well using Kodkod

in absence of integers
in absence of universally quantified axioms for refinement types

Ali made the TPTP interface

use it to run E
use it to run Paradox: apply results on EBU formulas to get counterexamples for infinite models

What can we hope to get to work:

two-variable logic with counting (C2)
- my paper pointer and reports on using such constraints, see also my one and only POPL paper
- this can handle arbitrary graph structures and replace statement 'objects of type A point along field f to objects of type B' with statements 'objects stored in container Ac point along field f to objects stored in container Bc'
- in practice use TPTP provers to prove C2 formulas, or use description logic implementation by Ian Horrocks (ask Ruzica and Abhinav)
recursive properties of trees (type refinements, e.g. eliminate implication from tree)
- use MONA or tree automata directly (Abhinav implements them)
- guarded fixpoint logics (see Graedel'03 overview paper): they allow recursion and they work on arbitrary graphs, because they are simple enough to admit tree model property; for implementation perhaps check Pierre Geneves
see whether we need to combine logics (as in Wies et al FROCOS submission, perhaps not, but if there is a use for it, send examples to Ruzica)

Recent thoughts:

logics with of terms with subterms (K. N. Venkataraman), and its with function images to prove equalities of data structure contents
parametric logic of trees with content (no subterms), its decidability

Important modeling questions (how to generate formulas from programs):

what should be the right model for case classes
- non-sealed case classes
- equality
- user-defined extractors
related to above, can we create reasonable models where we can use decision procedures that generalize algebraic data types and
- we can model non-sealed classes using sealed classes with an implicit 'Rest' class (if someone adds cases, they implicitly become subclasses of this one)
- is this already covered by DPLL(UF,CONS) or do we need to generalize the underlying techniques?

Can we come up with decidable class of constraints that handles a mix of case classes and standard classes, and it supports type refinements? Orthogonally, can we make it work with partial functions?

Further Topics

TODO list:

examine the benchmarks from the old version, look at verification conditions
- connect it to new prover infrastructure, generate counterexamples
- try the .SMT examples on Z3, Spass

interesting example where typed channels would bring something to actors.
get a compiler plugin stub working
figure out if there's anything to do with partial functions
look at formulas for valid and invalid cases (rethink generation of formulas)