Immutable Abstract Data Types

A useful approach in constructing software is to hide the implementation of key complex types. Abstract Data Types (ADT) are designed to do that.

An ADT introduces a type name and operations on this type, but does not reveal how the type is implemented. Outside the module implementing ADT it is possible to manipulate it only using provided operations.

Example Abstract Data Type

Consider an abstract data type called 'natSet', meant to denote sets of natural numbers and with operations empty,insert,remove,union,member:

abstype natSet  
empty  :: natSet
insert :: nat -> natSet -> natSet
remove :: nat -> natSet -> natSet
union :: natSet -> natSet -> natSet
member :: nat -> natSet -> bool

Note that natSet defines immutable values, that is, these are mathematical values and not mutable cells storing values.

Multiple implementations are possible:

  • sorted list of natural numbers
  • binary search tree
  • first-class functions (closures, behaves as unsorted lists)

A program can use natSet to define, for example, operations on binary representations of arbitrarily large integers. Because such program does not know the concrete implementation of natSet, we can change the implementation and the program can still compile and will be type safe.

Type safety is crucial for this to work: if we could cast another value into natSet using functions other than those declared, changing implementation could break.

ADT Value Construction Consequence

In the sequel, the following consequence of type safety and abstraction will be relevant for us:

Value construction consequence: Let $v$ be a value of an immutable ADT computed by a (potentially mutable) function in a type safe language. Then there exists a finite expression tree whose nodes are operations of ADT and whose leaves are either nullary operations of ADT or non-ADT values, such that this expression tree evaluates to value $v$.

Note that this consequence does not rely on any properties of the function computing $v$. The function could be arbitrarily complex, and need not terminate for all inputs, as long as we consider an evaluation that computed $v$. The expression tree is the part of the computation that is relevant for computing $v$ (for non-ADT values, it need not keep track how they are computed).

In the previous example, we are sure that all natSet values computed in any code that uses the ADT are obtained by starting from empty and applying operations insert,remove,union.