A formal specification is a precise, unambiguous specification of the functionality of a software system.
Examples:
A formal specification is typically done at the ADT level. The behavior of each operation on the data type would be specified.
An outside observer sees only the interfaces of the operations, including a precise (but still abstract) description of the functionality provided by each interface. Thus, other system developers can use the operations provided by the ADT correctly without knowing anything about how the operations are implemented or about the structure of the data.
Given a formal specification of an ADT:
Formal specifications can be divided into two general categories:
Algebraic specifications capture the behavior of an ADT in terms of sequences of operations on instances of the type. Operations are divided into two categories:
The behavior of constructors is not explicitly specified in an algebraic specification. The behavior of each observer is captured by specifying what it does to an application of each constructor.
ADT Stack[Elem] /* generic stacks of some parameter type Elem */ interface: /* constructors */ new: -> Stack push: Elem * Stack -> Stack /* observers */ pop: Stack -> Stack top: Stack -> Elem isEmpty: Stack -> Boolean size: Stack -> int axioms: pop(new) == error pop(push(e, s)) == s top(new) == error top(push(e, s)) == e isEmpty(new) == true isEmpty(push(e, s)) == false size(new) == 0 size(push(e, s)) == 1 + size(s) end ADT Stack
Advantages of algebraic specification:
Disadvantages of algebraic specification:
Algebraic specifications are very similar to functional programs. This example Standard ML program for binary search trees is easily translated into an algebraic specification.
The Larch family of formal specification languages consists of one shared algebraic specification language (the Larch Shared Language), and many interface languages for different programming languages. Examples: Larch/C++ and Larch/Ada. This approach allows specifiers to use algebraic specifications for parts of a system, and a more programmer friendly notation for other parts. A LSL specification for stacks is presented next. An interface specification using this Shared Language specification is presented in the section on Pre- and Post-State.
A Larch Shared Language specification of trait (type) Stack:
Stack(Elem): trait includes cardinal, boolean introduces new: -> Stack push: Elem, Stack -> Stack pop: Stack -> Stack top: Stack -> Elem isEmpty: Stack -> Bool size: Stack -> Card asserts Stack generated by new, push Stack partitioned by pop, top, isEmpty Stack partitioned by pop, top, size forall e: Elem, s: Stack pop(push(e, s)) == s top(push(e, s)) == e isEmpty(new) == true isEmpty(push(e, s)) == false size(new) == 0 size(push(e, s)) == 1 + size(s)
Features of LSL traits:
includes
clause specifies what other traits are included
(textually substituted) in this trait. Traits cardinal
and
boolean
are standard traits defined in the LSL handbook.
introduces
clause gives the types of all operations
defined in this trait
asserts
clause gives the axioms that hold for the
operations defined in this trait
generated by
clause gives one subset of the
constructors defined in this trait that can generate all legal
instances of the trait
partitioned by
clause gives one subset of the
observers that can be used to distinguish between any two instances
of the trait
asserts
clause gives the axioms
describing the behavior of the operations
Example: A Larch Shared Language specification of trait
PriorityQueue. This specification
uses a "specification constructor" build
, which is not
part of the interface of the trait (not visible to code that uses the
specified trait). This constructor is used only to make writing the
specification easier. When using such a specification constructor,
the constructors that are visible to client code need to be defined
in terms of the specification constructor.
This specification also uses an implies
clause to express
a condition that is always true about priority queues. This is closely
related to the use of invariants in model-based
specification languages.
An interface specification using this Shared Language specification is presented in the section on Pre- and Post-State.
PriorityQueue(Elem): trait includes boolean, integer introduces new: -> PriorityQueue enqueue: Elem, Int, PriorityQueue -> PriorityQueue dequeue: PriorityQueue -> PriorityQueue head: PriorityQueue -> Elem isEmpty: PriorityQueue -> Bool asserts PriorityQueue generated by new, enqueue PriorityQueue partitioned by head, dequeue, isEmpty forall e, e2: Elem, i, j: Int, q: PriorityQueue enqueue(e, i, new) == build(e, i, new) enqueue(e, i, build(e2, j, q)) == if i > j then build(e2, j, enqueue(e, i, q)) else build(e, i, build(e2, j, q)) dequeue(build(e, q)) == if q = new then new else build(e, dequeue(q)) head(build(e, q)) == if q = new then e else head(q) isEmpty(new) == true isEmpty(build(e, i, q)) == false isValid(new) == true isValid(build(e, i, new)) == true isValid(build(e, i, build(e2, j, q))) == i <= j /\ isValid(build(e2, j, q)) implies forall q: PriorityQueue isValid(q)
In model-based specifications, the data in the system is modeled using formal types such as:
All model-based specifications use pre- and post-conditions written over values of some (or all) of these model types. We'll study the SPECS-C++ notation for writing these conditions. SPECS-C++ is a model-based specification language designed for specifying the interfaces of C++ classes. Other examples of model-based specification languages: VDM and Z.
declarations of tuple types:
type ordered_pair = tuple (int ord, int abs);
Constructor: ()
Example: (1, 2)
is an element of type ordered_pair
Observers: the name of each field of a tuple type can be used to extract that field from a value of the type
Examples (assume t
= (1, 2)
):
ord(t) = 1 abs(t) = 2
In SPECS-C++, C-style dot notation can also be used:
t.ord = 1 t.abs = 2
declarations of set types:
type int_set = set of int;
Constructors:
Standard set constructor: {}
Example: {1, 2, 3}
is an element of type int_set
Set comprehension
syntax: {
expression |
predicate
}
Example:
{x + 1 | x > 0 /\ x < 6 /\ x mod 2 = 1}which is equivalent to:
{2, 4, 6}
Other constructors: \union
, \intersect
, -
Observers: \in
(membership), \subset
,
| |
(size or cardinality)
Example: |{2, 4, 6}| = 3
declarations of sequence types:
type int_seq = sequence of int;
Constructors:
for building sequences: <_>
: elem -> sequence of elem
Example: <1, 2, 3>
is an element of type int_seq
appending sequences: _ || _
: sequence of elem * sequence of elem
-> sequence of elem
Example: <1, 2, 3> || <5> = <1, 2, 3, 5>
Observers:
let A = <a, b, c>
_[_]
: sequence of elem * int -> elem
A[2] = b
head(_)
: sequence of elem -> elem
head(A) = A[1] head(A) = ai.e. the first element of the argument sequence
tail(_)
: sequence of elem -> sequence of elem
tail(A) = <b, c>i.e. all of the argument sequence except the first element
last(_)
: sequence of elem -> elem
last(A) = ci.e. the last element of the argument sequence
front(_)
: sequence of elem -> sequence of elem
front(A) = <a, b>i.e. all of the argument sequence except the last element
length(_)
: sequence of elem -> int
length(A) = 3i.e. the length of the sequence
SPECS-C++ notation for logical operators:
/\
- logical and
\/
- logical or
=>
- logical implication
!
- logical negation
\forall
- universal quantification
\exists
- existential quantification
Operations (implementation functions and procedures) are specified using:
These conditions are written as first-order predicate logic assertions over the model types.
Example - a generic (non SPECS-C++) specification of class Stack:
ADT Stack[elem] /* generic stacks of some parameter type elem */ type Stack = sequence of elem; function new():Stack /* an omitted pre-condition defaults to true */ post: new() = <> function push(e:elem; s:Stack):Stack post: push(s) = <e> || s function pop(s: Stack):Stack pre: s != <> /* or length(s) > 0 */ post: pop(s) = tail(s) function top(s: Stack):elem pre: s != <> post: top(s) = head(s) function size(s: Stack):int post: size(s) = length(s) /* end of ADT Stack */
Model-based specifications can specify procedures (operations that modify
their parameters). When the value of a parameter (say v
) changes
from the pre-state (the system state before the procedure is called)
to the post-state (the system state after the procedure is called),
the pre-state value is decorated
with a ^
(i.e. v^
) and the post-state value is decorated
with a '
(i.e. v'
).
Example - A SPECS-C++ specification of class Stack:
template <class Elem> class Stack { /* model ** domains ** sequence of Elem stacktype ** data members ** stacktype theStack */ public: Stack(); /* modifies: self ** post: theStack' = <> */ void push(Elem e); /* modifies: self ** post: theStack' = <e> || theStack^ */ Elem pop(); /* pre: theStack^ != <> ** modifies: self ** post: result = head(theStack^) ** /\ theStack' = tail(theStack^) */ int size(); /* post: result = length(theStack^) */ int isEmpty(); /* post: (theStack^ = <> => result = 1) ** /\ (theStack^ != <> => result = 0) */ }
Note that the parts of this specification that aren't comments are legal C++ syntax:
When implementing this specification, there is no requirement that data members in the implementation be public, or that data members in the implementation correspond in any way to the data members in the specification.
Example - A (partial) SPECS-C++ specification of class PriorityQueue:
template <class Elem> class PriorityQueue { /* model ** domains ** // representation of queue elements ** // p is priority, e is content ** tuple (int p, ** Elem e) qelemtype ** // representation of priority queues ** sequence of qelemtype qtype ** data members ** qtype theQ ** invariant ** \forall int i [1 <= i < length(theQueue^) ** => theQueue^[i].p <= theQueue^[i + 1].p ** ] ** abstract functions ** define uptoP(int pp, qtype q) as qtype such that ** (q = <> => result = <>) ** /\ (q != <> => ** (head(q).p < pp => ** result = <head(q)> || uptoP(pp, tail(q))) ** /\ (head(q).p >= pp => result = <>)) ** define afterP(int pp, qtype q) as qtype such that ** (q = <> => result = <>) ** /\ (q != <> => ** (head(q).p < pp => result = afterP(pp, tail(q))) ** /\ (head(q).p >= pp => result = q)) */ public: PriorityQueue(); /* modifies: self ** post: theQ' = <> */ void enqueue(Elem pe, int pp); /* modifies: self ** post: theQ' = uptoP(pp, theQ^) || <(pp, pe)> ** || afterP(pp, theQ^) */ Elem dequeue(); /* pre: theQ^ != <> ** modifies: self ** post: theQ' = front(theQ^) /\ result = last(theQ^) */ int isEmpty(); /* post: (theQ^ = <> => result = 1) ** /\ (theQ^ != <> => result = 0) */ };
Example: A Larch/C++ specification of class Stack. This interface specification uses the model (data and operations) from LSL trait Stack.
template <class ElemType> class Stack { uses Stack(ElemType for Elem); public: Stack() { modifies self; ensures self' = new; } void push(ElemType e) { modifies self; ensures self' = push(e, self^); } ElemType pop() { requires self^ != new; modifies self; ensures result = top(self^) /\ self' = pop(self^); } int size() { ensures result = size(self^); } int isEmpty() { ensures result = toInt(isEmpty(self^)); } }
Note that this specification uses the data and functions defined in Stack, rather than data and functions from formal types built into the specification language. This is still a model-based specification. The difference is:
Example: a Larch/C++ specification for class priority queue. This specification uses the model defined in trait PriorityQueue.
template <class Elem> class PriorityQueue { uses PriorityQueue; public: PriorityQueue() { modifies self; ensures self' = new; } void enqueue(Elem e, int p) { modifies self; ensures self' = enqueue(e, p, self^); } Elem dequeue() { requires self^ != new; modifies self; ensures result = head(self^) /\ self' = dequeue(self^); } int isEmpty() { ensures result = toInt(isEmpty(self^)); } }
In either notation, an interface specification of a class or other ADT implementation must provide some means to test all pre-conditions. Client code does not have access to the representation of the type, and so can not check pre-conditions directly in general.
Exercise: check whether pre-conditions are testable in each model-based specification we've done.
More interesting specifications require quantified assertions.
An assertion of the form:
\forall
type variable [
assertion
]
is only true if assertion is true for every variable value of type type.
Example - a function that tests whether or not a sequence is sorted:
function sorted(s: sequence of real):boolean post: sorted(s) = \forall int i [ 1 <= i < length(s) => s[i] <= s[i + 1] ]
For values of i
that are legal indices of s
, the
comparison must be made. For values of i
that aren't legal
indices of s
, the implication is automatically true.
An assertion of the form:
\exists
type variable [
assertion
]
is true if assertion is true for at least one variable value of type type.
Example - a function that returns the maximum value in a sequence:
function max(s: sequence of real):real post: \exists int i [ 1 <= i <= length(s) /\ \forall int j [ 1 <= j <= length(s) => s[i] >= s[j] /\ max(s) = s[i] ]]
For values of i
that are not legal indices of s
, the
existentially quantified assertion is automatically false.
In some cases, not all values that can be represented by the type definitions are legal values of the ADT.
Example: a lookup table usually doesn't allow two entries to be associated with the same key value.
An invariant is a logical assertion that all legal values of the ADT must satisfy. The pre- and post-conditions of the operation specifications must ensure that all values produced using the operations satisfy the invariant.
template<class RangeType> class Table { /* model ** domains ** int DomainType ** tuple EntryType ( ** DomainType Index, ** RangeType Value) ** data members ** set of EntryType theTable ** invariant ** \forall (EntryType t) [ ** t \in theTable^ => ** \forall (EntryType s) [ ** s \in theTable^ /\ t != s => ** t.Index != s.Index ** ] ** ] */ public: Table(); /* modifies: self ** post: theTable' = {} */ void AddEntry(DomainType I, RangeType V, bool& AddedOk); /* modifies: self, AddedOk ** post: (\exists (EntryType e) [ ** e \in theTable^ /\ e.Index = I] ** /\ AddedOk' = false) ** \/ (!(\exists (EntryType e) [ ** e \in theTable^ /\ e.Index = I]) ** /\ theTable' = theTable^ \union {(I, V)} ** /\ AddedOk' = true) */ void RemoveEntry(DomainType I); /* modifies: self ** post: theTable' = theTable^ - {e | e \in theTable^ ** /\ Index(e) = I} */ void AccessTable(DomainType I, RangeType& V, bool& Defined); /* modifies: V, Defined ** post: (\exists (EntryType e) [ ** e \in theTable^ /\ e.Index = I /\ V' = e.Value] ** /\ Defined' = true) ** \/ (!(\exists (EntryType e) [ ** e \in theTable^ /\ e.Index = I]) ** /\ Defined' = false) */ void ReplaceValue(DomainType I, RangeType V, bool& ChangedOK); /* modifies: self, ChangedOK ** post: (\exists (EntryType e) [ ** e \in theTable^ /\ e.Index = I ** /\ theTable' = (theTable^ - {e}) \union {(I, V)}] ** /\ ChangedOk' = true) ** \/ (!(\exists (EntryType e) [ ** e \in theTable^ /\ e.Index = I]) ** /\ ChangedOk' = false) */ }
Discussion:
self
AddEntry
is the only interesting argument
Larch/C++ specifications typically do not use invariants, because invariant
conditions are expressed using an implies
or constrained by
clause in
the trait that the Larch/C++ specification is based on.
Cons:
Pros:
The future of formal specifications:
Why study formal specifications?