IC Specifications

A specification is a concise description of the interface to a class or method. The difference between a spec and the usual explanatory doc is that the spec is complete: if fully specifies every action that the method takes in every possible situation, thereby eliminating all ambiguity as to what the method does.

There are several formal specification methods available which have the desired characteristics, except for one flaw: they're hard to use. To get a spec into actual use, it must be achievable by all programmers, not just the mathematically-oriented ones. Formal methods are way too intimidating.

The solution given here is combination of special notation with a mostly prose description. The result is far more readable than a formal notation, but has the same completeness characteristics.


Preconditions/Postconditions

The basis of a specification is the contract. This is a promise that the spec makes to the caller: if used in such and such a way, it'll deliver such and such a result. Guaranteed. This is stated in two parts: the preconditions and the postconditions.

Preconditions are the requirements that must be met when the routine is entered. They usually speak mainly in terms of the values of the parameters to the routine, although they may also refer to when the routine is allowed to be called and what values global variables used by the routines must have. Preconditions are conditions that the caller must guarantee.

Postconditions are the guarantees about any values returned, or actions taken, at the moment of exit from the routine. The implicit contract is always present: the postconditions are guaranteed only assuming the preconditions were true.

A spec is complete when the postconditions specify exactly what the routine does for every valid routine-entry scenario possible within the bounds of the preconditions.


Method Specs

The spec for a routine has this form:
    /**
     * The postconditions, phrased as what the routine does, as in "moves x
     * around a lot".  There may be several sentences, usually one for each
     * postcondition.
     * 

* Pre: the preconditions. Uses a similar format to the postconditions * except that it begins with "Pre:". */

This order might seem a bit backwards (since in a time sense the preconditions come before the postconditions), but this order makes specs easier to read since the postcondition is what casual readers are most interested in: what the routine does.

That covers the fundamentals of writing a spec. See the tutorial for elaboration on doing so. What follows here are some conventions and advanced notational features for special situations.

In postcondition statements it can be helpful to use the before/after notation. This is used to specify the value of a parameter or variable at the time of entry to the routine (written 'param and read "param before"), versus the value of the parameter or variable at the time of exit from the routine (written param' and read "param after"). For instance, a routine that increments i might write its postcondition i' = 'i+1.

Postconditions sometimes include a description of what happens during execution of the routine, rather than just the state of the variables at the end of the routine. For instance, it might say that the routine can block (i.e. pause for awhile before returning). Or it might say that the routine calls certain callbacks in a certain order while it executes. Such "during" conditions are required in the spec when they do something that impacts the caller's use of the routine. Normally "during" conditions are not needed.

Unless stated explicitly, values are assumed to be nonnull. So a routine that takes a String param s need not have the precondition, "s is nonnull".


Class Specs

A method cannot have a complete spec by itself; it must be speced in conjunction with the other methods of its class. This is because all the methods share internal variables, so each works with the others in changing the object state. Any of the class' method's postconditions rely upon the preconditions of all the class' methods being upheld.

The spec for a class has this form:

    /**
     * A one or two sentence informal description of the purpose of this
     * kind of object.  If the class has multiple sets of methods that are
     * used by different types of users, each is described as a separate
     * paragraph, as if the class were really several separate classes.
     * 

* MSS = the method sequence specification for the class, described * below. *

* Definitions: some terms that are used in several method specs can be * defined here once for common use. * *

Example:

* * A typical use example, an informal description of the object's methods * being used over a typical object lifetime. */
The only formal part of the class spec is the MSS; this specifies the order in which callers are allowed to call the class' methods. This may be a combination of the MSS notation and english, whichever works best. The MSS acts as an automatic precondition on all the methods in the class, every one of them must obey the class MSS in addition to their own preconditions. Sometimes it's helpful to supplement the MSS with preconditions for some of the methods, usually a combination of MSS and precondition is the most concise way to specify all the requirements.

Interfaces

Interfaces should be defined as carefully as classes are, but ICLs are only partially meaningful for interfaces since interfaces have no implementation to be checked for correctness. Interfaces act as a standalone specification, and all classes that implement them bear the burden of implementing the interface correctly.

This doesn't mean interfaces are ineffective for IC though. They provide the same sort of abstraction advantages that regular specs do, they just have no implementation of their own. Another helpful technique is to define multiple interfaces for a class that has fairly separable modes of use. When an instance of that class is passed as a parameter, the interface type can be given rather than the class type, which automatically docs the fact that the object is only to be used through that particular one of its interfaces.

Inheritance

Subclasses inherit all preconditions and postconditions from their superclasses. A subclass may have looser preconditions and tighter postconditions than the superclass, but not the other way around. That's the only way to guarantee the promises made by the superclass on behalf of all its subclasses. Similarly, implementors inherit conditions from their interfaces.

The fact that subclasses may have looser preconditions may seem questionable: the superclass has a precondition that says some fact X, and then a subclass overrides that and says X is not required after all. Is that really safe? Yes, as long as the subclass takes over all the functionality from the superclass; the subclass is responsible for guaranteeing correct results will be produced despite the less stringent precondition. If the subclass uses the superclass code though, it must make sure the superclasses requirements are fully met, so looser preconditions wouldn't be permitted.

The conditions listed in the superclass are not repeated in the doc for the subclass, so verifiers must check each superclass's conditions to get a complete list for the subclass. Superclasses may provide default phrasing for subclasses, as in "unless the subclass says otherwise, x is null". This means that the superclass does not guarantee this condition itself, but that many subclasses do so, just to save typing, they adopt the condition from the superclass unless it is explicitly overridden.


Method Sequence Specifications

Method Sequence Specifications (MSS) are a concise way of specifying method calling patterns. In class spec use, the MSS specifies all allowed calling sequences over the entire lifetime of an instance of this class; any unspecified sequences are not allowed. The same notation is also sometimes used for specifying local calling sequences in method specs.

The MSS notation is a regular expression of method names, just as grep uses regular expressions of characters. In regular expressions, x* means 0 or more x, x? mean 0 or 1 x, x+ means 1 or more x, and (x | y) means x or y. So the MSS (init run kill) means that the permitted sequence of calls to this object is first init(), then run(), then kill().

There are certain implicit rules used in MSS notation. First, every method mentioned occurs only where mentioned. That is, if x, y, and z are methods mentioned somewhere in an MSS, the reader can be guaranteed that they mustn't occur elsewhere within the MSS. Unmentioned methods are assumed allowed to be called at any time.

Second, MSSs sometimes mix calls into a method with calls from that method to others outside of the class. This technique is used when the author wants to show the relative sequence of actions made by the class in response to calls into it. There's no visual marker that this form of MSS is being used except by knowing which method names are incoming and which are outgoing.

Third, MSSs assume that any object constructor may be called before the MSS proper. Where it's necessary to talk about specific constructors, they may be given explicitly instead.

Although method names are usually given in MSS without parameters (because they don't matter for call-ordering purposes), sometimes they are needed. In such cases parens and arguments are given.

Specs for Multithreaded Code

MSS can be enhanced in multithreaded cases with the on-entry and on-exit notation, where ^x means "on entry to the method x()" and is usually read "x entry", and x^ means "on exit from the method x()" and is usually read "x exit". In this notation "x" alone is the same as "^x x^".

The advantage of entry-exit notation is that it specifies specific instants of time rather than durations, so it allows precise identification of calling orders within a method in multithreaded environments.

There is a fine point in the use of the name^ notation. That is that if one can't tell the difference between name^ and some slightly earlier state, one may still write name^. For instance, on an unsynchronized method name() the term name^ might be used to refer to the last object state-change inside the routine. The thread might actually still be in the routine (not having executed the return-from-method virtual-machine command yet), but no one can tell the difference. This probably seems useless now, don't worry: you'll recognize a need for this case when you see it. For synchronized methods such slop is not permitted: the moment of release of the monitor is clearly the exit moment.

Something to be aware of when working with multithreading code is the difference between recursion, which is multiple overlapped invocations of a routine from within a single thread, and reentrancy, which is multiple overlapped invocations of a routine from different threads. The difference is important because Java's synchronized keyword protects against reentrancy but not against recursion. One should clearly specify whether recursion or reentrancy are permitted in a method, if nothing is said it means the method allows both.


Instance Variable Specs

Instance variables are implicit inputs and outputs to every method in the object. Preconditions and postconditions must therefore take them into account. Usually it is more convenient to write such conditions as invariants once rather than repeating the same conditions with respect to the variable in several places.

Invariants are conditions that are both preconditions and postconditions on every method in a class. They are written as comments for the variables they affect, as for example:

    /** im1 = i-1. */
    private int im1 = 0;
    private int i = 1;
In this example, any method that modifies i must modify im1 too.


Intramethod Specs and "Macros"

When trying to reach high ICLs, it is necessary to split the code into very small pieces. But it can be inconvenient or can produce inefficient code to make separate subroutines for each piece. We need a way to take a small-to-moderate routine and break it into pieces for IC without having to make a bunch of tiny little subroutines.

One way is to use hierarchical comments. The 20 lines of code remain within a single routine, but comments are inserted that look like /*N */, where N is a number from 1 up. Such comments are said to be level-N comments. The idea is that all such comments that have a 1 are at the same level, and should be treated as the headers of subblocks that implement the greater purpose of the routine. All level-2 comments within a level-1 section are similarly headers for the implementation of that level-1 section.

Each level comment is a spec in itself. It specifies conditions for the subsequent block of code just as if it were a method of its own. Rather than having parameters, these code blocks use as inputs and outputs all the accessible variables of the routine.

This can go as many levels deep as needed. The trick is to make sure that each level of comments completely and precisely specifies the implementation of the level above, without requiring recourse to the level below to verify correctness. Each level should use preconditions and postconditions just as if it were a separate subroutine, and the code within the level shouldn't be examined by levels above: the level provides the same kind of abstraction that a method normally does.

This techinique gets out of hand with more than a couple levels, as it becomes hard to look at one level without succumbing to the urge to peer within it, since so much stuff must be skipped to get to the next correct examination point. The best solution would be preprocessor support so that one could name each subsection, and the code could then be written separately almost like a method, but would in fact be plugged in like a macro instead. But it's only a real problem for methods that are probably too big anyway.


Lexicons

There are some requirements that many pieces of code in the module or system use as a precondition. It would be tedious and error-prone and certainly not concise to repeat the full text of these everywhere they're used. The solution is the use of a lexicon; a document defining certain terms. By stating a single word or phrase, an entire multisentence condition can be adduced.

There can be many different lexicons: a global one, one for each subsystem, and perhaps one within a module for complicated modules. For instance, note that the class spec example had a definitions section; this is a small class-specific lexicon. For lexicons outside the file, an html href is used to allow readers to go directly to the lexicon when encountering an unfamiliar term.


Condition Shorthand Names

Method and macro conditions can have a lot of redundancy, as one method or statement will have as a precondition the same thing that another method or statement has as a postcondition. Instance variable invariants and class invariants can help with this.

Further redundancy reduction is possible using an abbreviation system. A shorthand name is given to each condition, and is used thenceforth within that file to refer to that condition. Condition names are usually a capital letter followed by a number. For instance, a precondition might be written as "x > 2 (C5)"; C5 is the condition name. The fact that the C5 is in parentheses means that this is its definition; later references to it will just be written C5 without parens.

Usually a new letter is used for conditions defined within a new class, just to make it easier to locate the condition definition.

Sometimes, one can mechanically check conditions by making sure that every postcondition of a routine is listed as a postcondition of one of the statements in that routine, and that every precondition within the routine is either a precondition to the whole routine, or a postcondition of a preceding statement. This latter fact means that an abbreviation has been defined for every precondition within a routine, another thing to check for.

Other times, abbreviations are given to conditions for the convenience of reviewers, even though that abbreviation may never be mentioned a second time.

 
 
Substantive changes:
    March 25, 1996: created.
Copyright © 1996, Steve Colwell, All Rights Reserved
Home page