Formal Specification

(read 9.3, 11, 10, remainder of ch. 9)

A formal specification is a precise, unambiguous specification of the functionality of a software system.

Examples:

A formal specification is typically done at the ADT level. The behavior of each operation on the data type would be specified.

An outside observer sees only the interfaces of the operations, including a precise (but still abstract) description of the functionality provided by each interface. Thus, other system developers can use the operations provided by the ADT correctly without knowing anything about how the operations are implemented or about the structure of the data.

Given a formal specification of an ADT:

Formal specifications can be divided into two general categories:

Algebraic Specifications

Algebraic specifications capture the behavior of an ADT in terms of sequences of operations on instances of the type. Operations are divided into two categories:

The behavior of constructors is not explicitly specified in an algebraic specification. The behavior of each observer is captured by specifying what it does to an application of each constructor.

ADT Stack[Elem]
/* generic stacks of some parameter type Elem */

interface:

 /* constructors */
  new: -> Stack
  push: Elem * Stack -> Stack

 /* observers */
  pop: Stack -> Stack
  top: Stack -> Elem
  isEmpty: Stack -> Boolean
  size: Stack -> int

axioms:

  pop(new) == error
  pop(push(e, s)) == s

  top(new) == error
  top(push(e, s)) == e

  isEmpty(new) == true
  isEmpty(push(e, s)) == false

  size(new) == 0
  size(push(e, s)) == 1 + size(s)
end ADT Stack

Advantages of algebraic specification:

Disadvantages of algebraic specification:

Algebraic specifications are very similar to functional programs. This example Standard ML program for binary search trees is easily translated into an algebraic specification.

The Larch family of formal specification languages consists of one shared algebraic specification language (the Larch Shared Language), and many interface languages for different programming languages. Examples: Larch/C++ and Larch/Ada. This approach allows specifiers to use algebraic specifications for parts of a system, and a more programmer friendly notation for other parts. A LSL specification for stacks is presented next. An interface specification using this Shared Language specification is presented in the section on Pre- and Post-State.

A Larch Shared Language specification of trait (type) Stack:

Stack(Elem): trait
  includes cardinal, boolean
  introduces
    new: -> Stack
    push: Elem, Stack -> Stack
    pop: Stack -> Stack
    top: Stack -> Elem
    isEmpty: Stack -> Bool
    size: Stack -> Card
  asserts
    Stack generated by new, push
    Stack partitioned by pop, top, isEmpty
    Stack partitioned by pop, top, size
    forall e: Elem, s: Stack
      pop(push(e, s)) == s
      top(push(e, s)) == e
      isEmpty(new) == true
      isEmpty(push(e, s)) == false
      size(new) == 0
      size(push(e, s)) == 1 + size(s)

Features of LSL traits:

Example: A Larch Shared Language specification of trait PriorityQueue. This specification uses a "specification constructor" build, which is not part of the interface of the trait (not visible to code that uses the specified trait). This constructor is used only to make writing the specification easier. When using such a specification constructor, the constructors that are visible to client code need to be defined in terms of the specification constructor.

This specification also uses an implies clause to express a condition that is always true about priority queues. This is closely related to the use of invariants in model-based specification languages.

An interface specification using this Shared Language specification is presented in the section on Pre- and Post-State.

PriorityQueue(Elem): trait
  includes boolean, integer
  introduces
    new: -> PriorityQueue
    enqueue: Elem, Int, PriorityQueue -> PriorityQueue
    dequeue: PriorityQueue -> PriorityQueue
    head: PriorityQueue -> Elem
    isEmpty: PriorityQueue -> Bool
  asserts
    PriorityQueue generated by new, enqueue
    PriorityQueue partitioned by head, dequeue, isEmpty
    forall e, e2: Elem, i, j: Int, q: PriorityQueue
      enqueue(e, i, new) == build(e, i, new)
      enqueue(e, i, build(e2, j, q)) ==
        if i > j then build(e2, j, enqueue(e, i, q))
        else build(e, i, build(e2, j, q))
      dequeue(build(e, q)) == 
        if q = new then new
        else build(e, dequeue(q))
      head(build(e, q)) == 
         if q = new then e
         else head(q)
      isEmpty(new) == true
      isEmpty(build(e, i, q)) == false
      isValid(new) == true
      isValid(build(e, i, new)) == true
      isValid(build(e, i, build(e2, j, q))) ==
        i <= j /\ isValid(build(e2, j, q))
  implies 
      forall q: PriorityQueue isValid(q)

Model-Based Specifications

In model-based specifications, the data in the system is modeled using formal types such as:

All model-based specifications use pre- and post-conditions written over values of some (or all) of these model types. We'll study the SPECS-C++ notation for writing these conditions. SPECS-C++ is a model-based specification language designed for specifying the interfaces of C++ classes. Other examples of model-based specification languages: VDM and Z.

Tuples

declarations of tuple types:

type ordered_pair = tuple (int ord,
                           int abs);

Constructor: ()

Example: (1, 2) is an element of type ordered_pair

Observers: the name of each field of a tuple type can be used to extract that field from a value of the type

Examples (assume t = (1, 2)):

ord(t) = 1
abs(t) = 2

In SPECS-C++, C-style dot notation can also be used:

t.ord = 1
t.abs = 2

Sets

declarations of set types:

type int_set = set of int;

Constructors:

Standard set constructor: {}

Example: {1, 2, 3} is an element of type int_set

Set comprehension

syntax: { expression | predicate }

Example:

{x + 1 | x > 0 /\ x < 6 /\ x mod 2 = 1}
which is equivalent to:
{2, 4, 6}

Other constructors: \union, \intersect, -

Observers: \in (membership), \subset, | | (size or cardinality)

Example: |{2, 4, 6}| = 3

Sequences

declarations of sequence types:

type int_seq = sequence of int;

Constructors:

for building sequences: <_>: elem -> sequence of elem

Example: <1, 2, 3> is an element of type int_seq

appending sequences: _ || _: sequence of elem * sequence of elem -> sequence of elem

Example: <1, 2, 3> || <5> = <1, 2, 3, 5>

Observers:

let A = <a, b, c>

SPECS-C++ notation for logical operators:

Pre- and Post-conditions

Operations (implementation functions and procedures) are specified using:

These conditions are written as first-order predicate logic assertions over the model types.

Example - a generic (non SPECS-C++) specification of class Stack:

ADT Stack[elem]
/* generic stacks of some parameter type elem */

type Stack = sequence of elem;

function new():Stack
/* an omitted pre-condition defaults to true */
post: new() = <>

function push(e:elem; s:Stack):Stack
post: push(s) = <e> || s

function pop(s: Stack):Stack
pre: s != <>
/* or length(s) > 0 */
post: pop(s) = tail(s)

function top(s: Stack):elem
pre: s != <>
post: top(s) = head(s)

function size(s: Stack):int
post: size(s) = length(s)

/* end of ADT Stack */

Pre- and Post-State

Model-based specifications can specify procedures (operations that modify their parameters). When the value of a parameter (say v) changes from the pre-state (the system state before the procedure is called) to the post-state (the system state after the procedure is called), the pre-state value is decorated with a ^ (i.e. v^) and the post-state value is decorated with a ' (i.e. v').

Example - A SPECS-C++ specification of class Stack:

template <class Elem> class Stack {
/* model
**   domains
**     sequence of Elem stacktype
**   data members
**     stacktype theStack
*/

public:

Stack();
/* modifies: self
** post: theStack' = <>
*/

void push(Elem e);
/* modifies: self
** post: theStack' = <e> || theStack^
*/

Elem pop();
/* pre: theStack^ != <>
** modifies: self
** post: result = head(theStack^)
**    /\ theStack' = tail(theStack^)
*/

int size();
/* post: result = length(theStack^)
*/

int isEmpty();
/* post: (theStack^ = <> => result = 1)
**    /\ (theStack^ != <> => result = 0)
*/
}

Note that the parts of this specification that aren't comments are legal C++ syntax:

When implementing this specification, there is no requirement that data members in the implementation be public, or that data members in the implementation correspond in any way to the data members in the specification.

Example - A (partial) SPECS-C++ specification of class PriorityQueue:

template <class Elem> class PriorityQueue {
/* model
**   domains
**     // representation of queue elements
**     //  p is priority, e is content
**     tuple (int p,   
**            Elem e) qelemtype
**     // representation of priority queues
**     sequence of qelemtype qtype
**    data members
**      qtype theQ
**    invariant
**      \forall int i [1 <= i < length(theQueue^)
**        => theQueue^[i].p <= theQueue^[i + 1].p 
**      ]
**    abstract functions
**      define uptoP(int pp, qtype q) as qtype such that
**           (q = <> => result = <>)
**        /\ (q != <> =>
**             (head(q).p < pp =>
**               result = <head(q)> || uptoP(pp, tail(q)))
**          /\ (head(q).p >= pp => result = <>))
**      define afterP(int pp, qtype q) as qtype such that
**           (q = <> => result = <>)
**        /\ (q != <> =>
**             (head(q).p < pp => result = afterP(pp, tail(q)))
**          /\ (head(q).p >= pp => result = q))
*/
public:

PriorityQueue();
/* modifies: self
** post: theQ' = <>
*/

void enqueue(Elem pe, int pp);
/* modifies: self
** post: theQ' = uptoP(pp, theQ^) || <(pp, pe)>
**               || afterP(pp, theQ^)
*/

Elem dequeue();
/* pre: theQ^ != <>
** modifies: self
** post: theQ' = front(theQ^) /\ result = last(theQ^)
*/

int isEmpty();
/* post: (theQ^ = <> => result = 1)
**    /\ (theQ^ != <> => result = 0)
*/
};

Example: A Larch/C++ specification of class Stack. This interface specification uses the model (data and operations) from LSL trait Stack.

template <class ElemType> class Stack {
  uses Stack(ElemType for Elem);
public:

Stack() {
  modifies self;
  ensures self' = new;
}

void push(ElemType e) {
  modifies self;
  ensures self' = push(e, self^);
}

ElemType pop() {
  requires self^ != new;
  modifies self;
  ensures result = top(self^)
       /\ self' = pop(self^);
}

int size() {
  ensures result = size(self^);
}

int isEmpty() {
  ensures result = toInt(isEmpty(self^));
}
}

Note that this specification uses the data and functions defined in Stack, rather than data and functions from formal types built into the specification language. This is still a model-based specification. The difference is:

Example: a Larch/C++ specification for class priority queue. This specification uses the model defined in trait PriorityQueue.

template <class Elem> class PriorityQueue {
  uses PriorityQueue;
public:

PriorityQueue() {
  modifies self;
  ensures self' = new;
}

void enqueue(Elem e, int p) {
  modifies self;
  ensures self' = enqueue(e, p, self^);
}

Elem dequeue() {
  requires self^ != new;
  modifies self;
  ensures result = head(self^)
       /\ self' = dequeue(self^);
}

int isEmpty() {
  ensures result = toInt(isEmpty(self^));
}
}

In either notation, an interface specification of a class or other ADT implementation must provide some means to test all pre-conditions. Client code does not have access to the representation of the type, and so can not check pre-conditions directly in general.

Exercise: check whether pre-conditions are testable in each model-based specification we've done.

More interesting specifications require quantified assertions.

Universal Quantification

An assertion of the form:

\forall type variable [ assertion ]

is only true if assertion is true for every variable value of type type.

Example - a function that tests whether or not a sequence is sorted:

function sorted(s: sequence of real):boolean
post: sorted(s) = \forall int i [ 1 <= i < length(s) =>
       s[i] <= s[i + 1] ]

For values of i that are legal indices of s, the comparison must be made. For values of i that aren't legal indices of s, the implication is automatically true.

Existential Quantification

An assertion of the form:

\exists type variable [ assertion ]

is true if assertion is true for at least one variable value of type type.

Example - a function that returns the maximum value in a sequence:

function max(s: sequence of real):real
post: \exists int i [ 1 <= i <= length(s) /\
         \forall int j [ 1 <= j <= length(s) =>
            s[i] >= s[j] /\ max(s) = s[i] ]]

For values of i that are not legal indices of s, the existentially quantified assertion is automatically false.

Invariants

In some cases, not all values that can be represented by the type definitions are legal values of the ADT.

Example: a lookup table usually doesn't allow two entries to be associated with the same key value.

An invariant is a logical assertion that all legal values of the ADT must satisfy. The pre- and post-conditions of the operation specifications must ensure that all values produced using the operations satisfy the invariant.

template<class RangeType> class Table {
/* model
**   domains
**     int DomainType
**     tuple EntryType (
**             DomainType Index,
**             RangeType Value)
**   data members
**     set of EntryType theTable
**   invariant
**     \forall (EntryType t) [
**       t \in theTable^ => 
**         \forall (EntryType s) [
**           s \in theTable^ /\ t != s =>
**             t.Index != s.Index
**         ]
**     ]
*/
public:
Table();
/* modifies: self
** post: theTable' = {}
*/

void AddEntry(DomainType I, RangeType V, bool& AddedOk);
/* modifies: self, AddedOk
** post: (\exists (EntryType e) [
**           e \in theTable^ /\ e.Index = I]
**         /\ AddedOk' = false)
**   \/ (!(\exists (EntryType e) [
**           e \in theTable^ /\ e.Index = I])
**       /\ theTable' = theTable^ \union {(I, V)}
**       /\ AddedOk' = true)
*/

void RemoveEntry(DomainType I);
/* modifies: self
** post: theTable' = theTable^ - {e | e \in theTable^
**                                   /\ Index(e) = I}
*/

void AccessTable(DomainType I, RangeType& V, bool& Defined);
/* modifies: V, Defined
** post: (\exists (EntryType e) [
**          e \in theTable^ /\ e.Index = I /\ V' = e.Value]
**          /\ Defined' = true)
**    \/ (!(\exists (EntryType e) [
**            e \in theTable^ /\ e.Index = I])
**        /\ Defined' = false)
*/

void ReplaceValue(DomainType I, RangeType V, bool& ChangedOK);
/* modifies: self, ChangedOK
** post: (\exists (EntryType e) [
**           e \in theTable^ /\ e.Index = I 
**           /\ theTable' = (theTable^ - {e}) \union {(I, V)}]
**           /\ ChangedOk' = true)
**   \/ (!(\exists (EntryType e) [
**           e \in theTable^ /\ e.Index = I])
**       /\ ChangedOk' = false)
*/
}

Discussion:

Larch/C++ specifications typically do not use invariants, because invariant conditions are expressed using an implies or constrained by clause in the trait that the Larch/C++ specification is based on.

Evaluating Formal Specifications

Cons:

Pros:

The future of formal specifications:

Why study formal specifications?