IC Specification Tutorial

This document covers the problems typically encountered while learning to write specs. Before reading this, make sure you know what a spec is.


Writing Specs

Be Concise

The trickiest part of spec-writing is in writing concise postconditions. The newcomer tends to use a natural-english description of what the class or method does; this can be many times larger than the code itself. Concomitant with this size is a lack of understandability. When it gets to the point that a reader can find an answer more easily by delving into a few levels of the code than by wading through the spec, the spec is not fulfilling its function.

Always strive for simpler conditions. A good condition is often only 4 or 5 words long. To achieve that sort of conciseness, one must ruthlessly eliminate busy words. The IC notation is designed to help you to eliminate many of them. A common error is to use words like "always" or "guaranteed" or "required" or "the caller must" in one's conditions. Usually these can be removed entirely since they are assumed for any condition. A bald precondition statement that "x is positive" means "the caller must make sure that x is required to be always positive". So save yourself the typing.


Inputs and Outputs

When beginning to write a spec for a method, it can help to think of the routine in terms of its inputs and outputs. The obvious inputs are the method parameters, and the obvious output is the method return value. But many routines have additional inputs: instance variables or global variables. And many routines have additional outputs: changed parameters, global variables, or system state changes (like whether an unusual amount of time passes during the method, or a file is created on the disk). It's not hard with practice to think of outputs in terms of every single change that a routine makes that anyone might notice. An input is any state information anywhere in the system that might change the method's behavior.

Naturally it is much easier to form the list of inputs and outputs for methods that have no side effects: global variables and system state. So avoid side effects in order to make spec writing easier.

Once you've made a list of a method's inputs and outputs, writing the spec is easy. The preconditions are the requirements that the inputs must satisfy on entry to the method, and the postconditions are the guarantees the method makes about the outputs. By checking off inputs and outputs and thinking about whether all conditions about them have been addressed, one can verify that one's spec is complete. Every possible form of the input that meets the preconditions must have a specified output in the postconditions.

For instance, suppose we have the following:

    /** Remove all whitespace before the first nonwhitespace in s. */
    String trim(String s)
    {
    }
This spec seems sort of clear, but let's think about it in terms of the input cases. The only input is s. Let's make a precondition that s be nonnull. The valid input cases are: s may have leading whitespace or not, s may be all whitespace, s may be empty. Now we check these cases against the spec above. They're all fine except the "all whitespace" case: the spec seems ambiguous on this point. Should the whitespace be removed or not? Common sense tells us that probably it should be, but the spec certainly doesn't say that. The alternate phrasing "remove all leading whitespace" has the same problem.

A better postcondition phrasing is "Returns the largest substring of s that has no leading whitespace". Often this sort of semi-mathematical phrasing makes for good postconditions, because it states the characteristics of the result rather than the method's implementation. In comparing this new spec against the various input cases we see that it clearly specifies what the method does in each case.

The thing to watch out for is implicit assumptions. You have to think about assumptions being made at every stage. Do you really have all the inputs and outputs listed? Do you really have every input case listed? Have you really thought about every parameter? It's very easy to slide right past an input case or a parameter because it's so obvious what it does, resulting in incomplete consideration of all the input cases. If this happens to use, try using a checklist approach, checking off each input and output as it is considered to make sure you skip no steps.


Conditions Should Specify State, Not Implementation

The most common error is to write a postcondition that parallels the code, rather than one that says what the method has accomplished on exit. Think of the postcondition as describing the state of the object at the moment of exit, rather than as a description of what the method does to reach that state.

For instance, here are two specs for a method:

    /**
     * If x is in the list l, move it to the start of the list.  If it isn't
     * in the list l, add it to the start of the list.
     */
    myFunc() ...
 
    /** Put x at the start of l. */
    myFunc() ...
In writing the second spec, I've assumed that there is an invariant on l stating that any x may occur only once within it. The latter spec is just as precise as the former, and because it's stated as a condition rather than a recipe, it's easier to check things against it for correctness.

Here's how to get a condition form instead of an implementation one. Try to think what describes x at the moment of exit from the routine. The first thing that comes to mind is that x will be at the start of the list. That is a simple fact that will definitely be true. To make sure all cases are covered we must make it clear that x may not be in two places in l at once; in this case, l's invariant is a good place for storing this fact. So we now have "Post: x is at the start of l".

The final step is to turn this into an action description rather than a state-at-exit. This is an optional step; you may well prefer the clearer state-at-exit form. But for naive code readers, such conditions can be confusing because they use a different tense than typical function descriptions. This is easily remedied by adjusting the phrasing to the action-oriented style "Put x at the start of l"; we've just added a thin veneer of tense change to make it more palatable to IC novices while preserving the underlying state-based characteristics.


Implementation-think is Bad For Variables Too

Sometimes one thinks carefully about methods and writes good nonimplementiony specs for them, but doesn't give the same care to variables. Each instance variable or parameter to a method must be considered in the light of what functions it provides, rather than what it does. For instance, suppose one has an array. Beginners are tempted to make a precondition saying, "it's an array" or maybe, "it's a nonempty array". Beyond that they're stuck.

The arrayness of the parameter is its implementation. For a spec, you need the meaning behind the implementation. Is it today's date as an array of characters? Is it a buffer for receiving input? Is it a set of unsorted numbers? Our programming languages make strong typing seem like the be-all and end-all of correct coding, but IC goes far beyond that. A compiler can determine what type a variable has; it takes your brain to say what the variable means. Only if you step beyond what the compiler does can you gain the advantages of IC.


Use the Caller's Point of View

One can sometimes tell that a spec is too implementationy, but can't figure out how to write it differently. This often happens as a result of writing the spec after the code has been written. You look at the code to see what it does, write a little more spec, look at the code again, and before you know it you've got an implementationy spec.

The solution is to not look at the code. Cover it with your hand if you have to. Instead, think about what the routine does from the caller's point of view. The caller couldn't care less about those implementation details, so this technique often immediately results in something close to a correct spec. For instance, it might give, "well, it moves x around in the list". Work with this first phrase, fixing it up until you end up with a correct spec like "Put x at the start of the list".

Routines That Seem Easy Can Be Hard To Spec, Too

It's just as important to use the caller's point of view when specing a simple routine as a complex one. Take for instance the following routine:
    /** Returns whether c is a special character. */
    boolean isSpecial(char c)
    {
	if (c == '[' || c == ']' || c == '(' || c == ')' ||
	    c == '{' || c == '}' || c == '&' || c == '.' ||
	    c == '+' || c == '"' || c == '#' || c == '\\' ||
	    c == '\'')
	{
	    return true;
	}
	return false;
    }
You'd think this would be super easy to spec, and in fact it seems like a fine spec at first glance: assuming of course that the term 'special character' is clearly defined elsewhere in the class.

However, this just underscores the point that you can't write a spec based on its code; what's important is what the caller is expecting of it. In this case, examination of the class showed that a special character is actually was defined as being one of [ ] ( ) { } && . or +. That is, the last four characters this routine is checking for were not in the definition! Examination of the caller revealed why; from the caller's point of view the spec should have been, "Returns whether c is a special character, or one of " # \ '." The caller wanted this special behavior on the last four characters for its own reasons.

Another way to say this is that you should think of the function of a routine in terms of what it is rather than what it does.


Compare and Contrast

Sometimes a first attempt at a spec is too vague and general. That spec could fit 100 different methods. The question to ask yourself is what this method does that's different than what others that fit that spec do. It's often particularly easy to ask, since there is often such a similar method nearby in the same module or subsystem.

Suppose you have a function for getting the next character from a stream. Internally, this function first checks to see whether the unget function has been used, in which case it returns that character. Otherwise, it calls a routine that really gets the next character from the stream. Your first shot at a spec for each of these functions might be the same, "gets the next character from the stream". Obviously something needs to be done because they don't do exactly the same thing yet they have the same spec.

First, don't throw out the specs! These specs are very close to being perfect, they just need a little tweak. To figure out what to tweak, compare the methods' functionalities against each other. In this case, they differ in what the "stream" is that each refers to. The outer function gets a character from the encompassing BackableStream, while the inner function gets the character from a DataInputStream. Change the word 'stream' in each spec to the more specific form and you're done. In fact, if the meaning of stream in each spec is already unambiguous in its context, no change at all may be needed.

The point is, you can add just the necessary degree of specificity to an overly general spec by comparing its method against others like it to see what its distinguishing markings are. Such comparisons help to focus you on the important function of the method.


Change the Code, Not Just the Spec

Sometimes (maybe even usually) a method will seem just too hard to spec. You try and try but the spec keeps coming out wordy and complicated. This is probably because the method is itself complicated; there's no easy way to spec it. If the mountain won't come to Mohammed, Mohammed has to go to the mountain. Instead of trying to spec a messy routine, clean up the routine to match a simple spec.

One way to think of this sort of situation is that you are going to have to spec the method. Given that, what's the easiest way to get it done? If the easiest way is to rewrite the code, so be it. Do what's easiest.

It's easy for a moderately-sized methods to be fundamentally unspecable; they often refer to or change too many variables, and they do so in such a complex way, that there's no more concise spec than the code itself.

The rework of such a method must go in two directions: toward a more simple functionality, and towards a simpler implementation of that functionality. The first step is to change what the routine does until it can be described with only a few conditions. This is the idea behind modular design: to improve the cohesion and coupling of the program.

A simpler functionality is achieved by loosening coupling. Loosely coupled routines have few parameters and no side effects, so that there are few points of contact between the caller and the implementor. That way the few points of contact can be described in few lines of spec.

A simpler implementation is achieved through tighter cohesion; this means that one should find the cut-points within a routine that break it into separate loosely-coupled routines. It's like diamond cutting; you fracture the gem where it is easiest. The pieces remaining are cohesive because they are tightly-coupled internally.


Use Custom Data Types

Another way of simplifying a method is to identify cohesiveness among its variables and to make new data types by grouping them. Rather than splitting up the method into chunks of lines, you pull out common nuggets of code that manipulate groups of variables.

For instance, rather than having two variables, an array and an index into it, one could use a Stack or Queue object. Everywhere that the method used to adjust the array and index in lockstep, it can now do a single operation on the new data type. This simplifies the module from the inside out rather than from the outside in as we did above.

Often a group of instance variables works together to provide a class-specific internal capability, and it doesn't seem like it makes sense to make a whole new abstract data type class just for this one use. Newcomers to IC should still make a separate class for such one-use data types, because the process of separating out instance variables into a standalone class helps greatly in thinking in terms of proper specs. Advanced programmers can relax this somewhat, but for groupings of more than a couple instance variables it probably still makes sense to define a new class data type.

When there are only two variables working together, and the programmer has a good feel for IC concepts, it is possible to leave the variables within the encompassing class but to document them as if they were a separate class. All uses of the variables are then treated like method calls on the "ghost" data type, without the syntactic overhead of making a new class. Still, even advanced programmers should strongly consider going ahead and making a new class in such a situation, as it makes for the most readable code in most cases.


Write Correct Code, Not Robust Code

In simplifying a class design, one can take advantage of a strength of preconditions: they put some of the burden for correct use on the caller. We've all gotten into the habit of making code robust by trying to handle various not-quite-correct inputs in a reasonable way. Unfortunately, this sort of robustness can make the code and spec complicated, as every possible error condition is robustly turned to some useful result.

A better approach is to restrict the form of the inputs to the class, through preconditions. More cases that used to be handled in a halfway manner can then be discarded (through assertions or just reliance on proper fullfillment of the preconditions), greatly simplifying the module.


The Snowball Effect

All these techniques for simplifying a method work synergistically; as each is applied, the code gets a little bit simpler, enabling one to understand the code a little better and to therefore see another simplification. It's like a snowball rolling downhill; it gets bigger and bigger the more to work at it. A routine might seem impenetrable at first, but if you just keep applying one small gem fracture or custom data type or redefinition of requirements after another, they'll soon get easier and the next thing you know you'll have properly small and understandable methods.

Well speced code often seems too simple; a sometimes-heard complaint is, "all the examples I've seen of speced code are fine, but they're too simple; there's nothing to show how to spec real code like mine!" These people just haven't realized yet that it's no accident all speced code is simple; it was rewritten that way so that it could be specable! "Real" code of the type intimated will have to be simplified first and speced second.


It Should Click

You should feel a mental click when you read a good spec. The click tells you that it all locks together; it gives you the feeling that you know what this routine is supposed to do, in every case. There are no hidden gotchas. The preconditions cover all possible inputs; the postconditions list all outputs. Properly speced code works like a mathematical equation: the preconditions plus the implementation plus the specs of any adduced modules lead inexorably to the postconditions.

Note the mention of "adduced modules". This is an important point: the routine is responsible for the actions of any routines it uses internally. That is, the reader of your spec doesn't need to know whether your implementation is one big routine, or a bunch of calls to outside routines: the reader just wants to know that your spec covers everything that happens between entry to and exit from the routine.

 
 
Substantive changes:
    March 25, 1996: created.
Copyright © 1996, Steve Colwell, All Rights Reserved
Home page