LARA – Lab for Automated Reasoning and Analysis -

Pattern matching project

SVN

All sources are available from a publicly accessible SVN repository.

Valuable Papers

[1] T. Millstein, Practical Predicate Dispatch, OOPSLA'04, Oct. 24-28 2004

[2] M. Pettersson, A Term Pattern-Match Compiler Inspired by Finite Automata Theory, International Workshop on Compiler Constrution in 1992(CC'92), Springer-Verlag LNCS-641

[3] A. Kennedy and C.V. Russo, A Generalized Algebraic Data Types and Object-Oriented Programming, OOPSLA'05 , Oct. 16-20 2005

[4] H. Xi, C. Chen and G. Chen, Guarded Recursive Datatype Constructors, POPL'03 , Jan. 15-17 2003

[5] J. Liu and A.C. Myers, JMatch: Java plus Pattern Matching

[6] B. Emir, M. Odersky and J. Williams, Matching Objects With Patterns, LAMP-REPORT-2006-006

[7] B. Emir, Q. Ma and M. Odersky, Translation Correctness for First-Order Object-Oriented Pattern Matching, LAMP-REPORT-2007-003

[8] C. Kirchner, P. Moreau and A. Reilles, Formal Validation of Pattern Matching Code, PPDP'05, July 11-13

[9] P. Wadler, Views: A way for pattern matching to cohabit with data abstraction, 1986

[10] T. Freeman and F. Pfenning, Refinement types for ML, ACM SIGPLAN ’91, June 26-28 1991

[11] M. Fähndrich, K. Rustan and M. Leino, Declaring and Checking Non-null Types in an Object-Oriented Language, OOPSLA'03, Oct. 26-30 2003

[12] http://blogs.msdn.com/dsyme/archive/2007/04/07/draft-paper-on-f-active-patterns.aspx

[13] A Logic for Miranda, revisited

[14] ATS Dependent Types

[15] Pattern matching with dependent types

Brief Summary

[11] They introduce a notation to distinguish nullable and not-nullable objects. It is shown how JAVA/C# type system could be refined in order to allow this extension. It is interesting for us because of the (particular) way null values are actually deal with in Scala for a PM expression.
The current specification of our language doesn't take in consideration the null value but we may think to extend it with a similar notation to provide a finer reasoning about nullable objects.

[12] Introduces translation from a pure functional language Miranda into Isabelle. The translation includes handling of pattern matching.

Scala code snippets

Fun with extractors

Side-effects

This example feels complete, but will break because of the side-effect on the (variable) value: cells with odd numbers will not match any of the cases, as the first call to the extractor will turn them into even numbers.

class Cell(var value: Int)
 
object Cell {
  def unapply(cell: Cell) = {
    cell.value += 1
    Some(cell.value)
  }
}
 
object Cells extends Application {
  val myCell = new Cell(3)
 
  Console.println(myCell.value + " is " + (myCell match {
    /* It looks complete, but side-effects will kill it. */
    case Cell(value) if value % 2 != 0 => "odd"
    case Cell(value) if value % 2 == 0 => "even"
  }))
}

Extractor and ADT

This other example is more of a warning: even though the first pattern looks like it will only apply to FancyCells, it will match the regular Cell because the extractor is defined as such in the FancyCell object. As a result, we cannot trust the objects to always behave as if they were defined to match only the corresponding class's instances.

class Cell(val value: Int)
 
object Cell {
  def unapply(cell: Cell) = Some(cell.value)
}
 
class FancyCell(override val value: Int, val name: String) extends Cell(value)
 
object FancyCell {
  def unapply(cell: Cell) = Some(cell.value)
}
 
object MoreCells extends Application {
  val myCell = new Cell(3)
 
  Console.println("I found " + (myCell match {
    /* Watch out, this first line is going to match even a Cell */
    case FancyCell(3) => "a fancy three."
    case Cell(3) => "a three." /* Never attained for the current definition */
    case _ => "nothing."
  }))
}

And it is interesting to compare the previous code with the following slightly modified. Notice that now the FancyCell is a case class (hence it defines a classic ADT). In this situation the case Cell(3) is attained!

class Cell(val value: Int)
 
object Cell {
  def unapply(cell: Cell) = Some(cell.value)
}
 
case class FancyCell(override val value: Int) extends Cell(value)
 
 
object MoreCells extends Application {
  val myCell = new Cell(3)
 
  Console.println("I found " + (myCell match {
    case FancyCell(3) => "a fancy three."
    case Cell(3) => "a three." /* Now, without it, we would obtain a MatchError */
    case _ => "nothing."
  }))
}

Null in Node unapply extractor method

This example has little interest in itself, in fact none would conscientiously mess the Node extractor method with null values. Although, it does show that the usual definition of completeness for pattern matching should be refined if we want to take in consideration any possible behaviour. Thus, is something to keep in mind the moment we will start thinking to work on a scalac plugin.

/* class hierarchy */
abstract class Tree {
  def isNode: Boolean = 
  /* it seems complete, but in the Node extractor method we mess with null. 
     A MatchError is thrown because the pattern Node(l,v,null) is not matched by any case's */  
  this match {
    case Node(l,r: Tree) => true
    case EmptyTree() => false
  }
}
class Node(val left: Tree, val right: Tree) extends Tree
class EmptyTree extends Tree
 
 
/* extractors ...*/
object Node {
  def apply(left: Tree, right: Tree) = new Node(left,right)
 
  def unapply(node: Node) = {
    val myNode: Tree = null
    Some((node.left,myNode))
  }
 
} 
 
object EmptyTree {
  def apply() = new EmptyTree()
  def unapply(empty: EmptyTree): Boolean = true
}

Note that this problem is independent of the fact that we have used extractor. In fact the following definition (that uses only ADT) is sensible to the very same problem, that is (for example) a null right branch of a node will determine a MatchError.

/* ADT class hierarchy */
abstract class Tree {
  def isNode: Boolean = 
  /* it seems complete, but in the Node extractor method we mess with null. 
     A MatchError is thrown because the pattern Node(l,v,null) is not matched by any case's */  
  this match {
    case Node(l,r: Tree) => true
    case EmptyTree() => false
  }
}
case class Node(val left: Tree, val right: Tree) extends Tree
case class EmptyTree extends Tree

The null value in Scala has type scala.Null <: scala.AnyRef and, even more, scala.Null $\sqsubseteq$ scala.AnyRef $\sqcap$ scala.Null. Moreover, for any object of type T we have that scala.Null <: T. Hence, it doesn't make sense the behaviour that we can observe in the above example as the right branch of type Tree is enough general to cover the subtype relation existing with scala.Null type.

So, where is the problem?

So to speak, there is no error. But Scala handle the null value in a special way in a PM expression. So, even if null has a type relationship with any reference value, its type is not used. An additional interesting fact is that any type test (isInstanceOf[T]) on the null value fails (this is coherent as the relationship between the type of an object and scala.Null is not considered even if it virtually exists).

Pattern-matching blocks are partial functions

…and therefore are usable as such. The following code shows several examples: pattern-matching blocks are passed around, and even combined, using for instance the orElse function (which does exactly what you'd think it would with two partial functions).

( code could/should be simplified to get to the point..)

abstract class BinaryTree {
  def matchOnTrees[A](f: PartialFunction[BinaryTree,A]): Unit = {
    Console.println(if(f.isDefinedAt(this)) {
      val result = f(this)
      "Result of matching expression is: " + result
    } else {
      "No match was found."
    })
  }
 
  def megaTreeMatch[A](fs: List[PartialFunction[BinaryTree,A]]): Unit = fs match {
    case Nil => Console.println("Couldn't match it, sorry.")
    case f :: fs => if(f.isDefinedAt(this)) Console.println("Matched: " + f(this)) else megaTreeMatch(fs)
  }
 
  def megaTreeMatch2[A](fs: List[PartialFunction[BinaryTree,A]]): Unit = {
    val combinedFunction = fs.reduceLeft[PartialFunction[BinaryTree,A]]((f1,f2) => f1.orElse(f2))
    matchOnTrees(combinedFunction)
  }
}
 
case class Node(value: Int, left: BinaryTree, right: BinaryTree) extends BinaryTree
case object Leaf extends BinaryTree
 
object Test extends Application {
  val searchTree = Node(4, Node(2,Node(1,Leaf,Leaf),Node(3,Leaf,Leaf)), Node(5,Leaf,Leaf))
  val matcher1: PartialFunction[BinaryTree,String] = {
    case Node(0, Leaf, Leaf) => "Tree with only zero"
  }
 
  val matcher2: PartialFunction[BinaryTree,String] = {
    case Leaf => "Empty tree"
    case Node(3, _, _) => "Tree rooted at 3"
  }
 
  val matcher3: PartialFunction[BinaryTree,String] = {
    case Node(value, _, _) => "Tree (any) rooted at " + value
  }
 
  searchTree matchOnTrees {
    case Node(value, _, _) => value
    case Node(value, Leaf, Leaf) => value
  }
 
  searchTree matchOnTrees {
    case Leaf => "oops"
  }
 
  searchTree megaTreeMatch2 (matcher1 :: matcher2 :: matcher3 :: Nil)
}

Case null & completeness of PM

By inspecting the sources of liftweb - http://liftweb.net/ - (to our knowledge the biggest Scala application built for business purposes) it appears that matching on the null value is extensively used. Take the following example as a (trivial) use case.

object Null extends Application {
 
  def makeStatement(str: String): Option[String] = str match {
    case null => None
    case "bob" => Some("I do not like "+str)
    case v @ _ => Some("I do like "+v)
  }
 
  val name: String = null
  println(makeStatement(name) match {case Some(s) => s case None => "I'm confused..."})
}

matching on null allows to recover nicely from errors without overwhelm the code with try{}catch{} statements.

Method call in guards

The PM expression is obviously complete in respect of the definition of method getErrorMsg. Although, to correctly verify it we need some signature (e.g. postcondition) on the method…

0 match {
  case code @ _ if getErrorMsg(code) == "CRASH" => //expr1
  case code @ _ if getErrorMsg(code) == "OK"    => //expr2
}
 
def getErrorMsg(code: Int): String = {
  if(code == 0) "CRASH"
  else          "OK"
}

Case classes as Extractors

Case classes can be seen as a particular case of extractor definition. In fact, a case class can be mechanically translated in a pair class (that contains the data and operation of the class) and an object that define the constructor (apply) and extractor (unapply) method of the specific object. Let's take a simple example.

abstract class Term
case class Plus(l: Term, r: Term) extends Term {
  override def toString = "("+l +" + "+ r+")"
}
case class Number(n: Int) extends Term {
  override def toString = n+""
}

The above code define a class hierarchy using case class. We are interested in translating this definition in an equivalent using extractor. Here is how we could do this step:

abstract class Term
//class fields have to be accessible from anywhere (declared as val)
class Plus(val l: Term, val r: Term) extends Term {
  override def toString = "("+l +" + "+ r+")"
}
//extractor
object Plus {
  def apply(l: Term, r:Term) = new Plus(l,r)
  def unapply(p: Plus) = Some((p.l,p.r))
}
 
class Number(val n: Int) extends Term {
  override def toString = n+""
}
//extractor
object Number {
  def apply(n: Int) = new Number(n)
  def unapply(nu: Number) = Some(nu.n)
}

The two definitions are absolutely equivalent. In fact the following eval function would held the same result independently of the class representation used.

def eval(t: Term): Int = t match {
  case Plus(l,r) => eval(l) + eval(r)
  case Number(n) => n
}

Core Language Specification

$\begin{eqnarray*} prog~&~::=~&~constraints~^{0 \ldots n}ob^{0\ldots k}~cd^{0\ldots n}~e~&~\\ constraints~&::=~&~\textbf{/* constraints of }~C~\textbf{: }~fm~\textbf{*/}~&~\\ ob~&~::=~&~\textbf{object}~C~\{~extr~\}~&~\\ extr~&~::=~&~\textbf{de-f}~\textbf{unapply(}x~\textbf{:}~C\textbf{) :}~extT~\textbf{=}~e~&~\\ extT~&~::=~&~\textbf{Boolean}~|~\textbf{Option[(}D^{2\ldots n}\textbf{)]}~|~\textbf{Option[}~D~\textbf{]} \\ cd~&~::=~&~[\textbf{abstract}]~\textbf{class}~C\textbf{(}f^{0 \ldots n}\textbf{:}~C^{0 \ldots n}\textbf{)}~\textbf{extends}~D~\{~md^{1 \ldots k}~\}~&~\\ md~&~::=~&~\textbf{de-f}~\texttt{m}(x^{1\ldots n}:~C^{1 \ldots n})~:~C~=~e~&~\\ e~&~::=~&~\texttt{x}~|~e.\texttt{m}(e^{1 \ldots n})~|~(~e~:~C)~\textbf{match}~\{~c^{0\ldots n}~\}~|~e.\texttt{f}~\\ &&~\textbf{if~(}~e~\textbf{)}~\{~e~\}~\textbf{else}~\{~e~\}~|~\textbf{new}~C\textbf{(}e^{0\ldots n}\textbf{)}\\ &&~be~|~(~e~)~|~e~aop~e\\ p~&~::=~&~\texttt{x}~:~C~|~ C(p^{0\ldots n})~&~\\ c~&~::=~&~\textbf{case}~p~gd~\Rightarrow~e~&~\\ gd~&~::=~&~\textbf{if}~be~|~(~\texttt{empty}~)~&~\\ be~&~::=~&~e~bop~e~&~\\ aop~&~::=~&~\textbf{+}~|~\textbf{-}~|~\textbf{*}~|~\textbf{/}~|~\textbf{\%}~&~\\ bop~&::=~&~\textbf{!=}~|~\textbf{$<$}~|~\textbf{$<$=}~|~\textbf{==}~|~\textbf{\&\&}~|~\textbf{$||$}~& \end{eqnarray*}$

The production for a method defnition should start with def, not de-f. The error is due because def is a keyword in the wiki latex language and so it can't be used without protecting it (the problem is that I haven't found how to protect it).