Using Signatures to Improve Smalltalk Productivity and Reuse

 
Steve Burbeck
IBM Research
sburbeck@us.ibm.com
 
 Revision History:
 First Review Draft - 1/23/95
 Second Review Draft - 2/26/95
 Third Draft - 6/22/95
 HTML version - 10/29/97
 
 

Non Confidential Document

 
TABLE OF CONTENTS

1. OVERVIEW

2. A KEY PROBLEM: MISSING DESIGN

3. METHOD SIGNATURES

4. QUALIFIERS

5. SIGNATURE EXAMPLES

6. FOCUS ON DESIGN INTENT

7. NEXT STEPS

8. REFERENCES

9. APPENDIX: QUICK REFERENCE TO SIGNATURES


1.  Overview

1.1  A Blueprint for Change

Change is coming to the way we develop object oriented systems.  Present-day development of object systems relies on a craft culture: each component is hand crafted by highly skilled programmers to fit the unique needs of each project.  This craft culture is already hard pressed to meet the needs of current projects.  There are too few skilled object technology artisans to meet the growing demand.  Even when enough skilled artisans can be found, a craft culture offers few opportunities for economies of scale.  Bigger projects simply require more artisans; the larger the staff, the higher the management overhead, the lower the productivity of the technical staff, and the greater the risks due to lack of effective communication.  Moreover, a craft culture tends to resist reuse; a highly skilled artisan would usually rather build a custom object than reuse one that doesn’t quite fit.   In doing so, the professional skills of the artisan improve but accumulation of reuseful and reusable software components is slow at best.  Large scale use and reuse of object technology requires a new culture.

As with all cultural transitions, the transition to a reuse culture will take some time and will progress in fits and starts.  However the transition is likely to pass through three phases.

Most current OO projects are in the first phase.  Each phase will have its unique challenges.  The challenges in phase one are to improve the productivity of the artisans and to pave the way for harvesting.  We must grow the skills base, provide processes and tools that maximize the productivity of OO developers, and maximize the harvestability of the one-off components developed in phase one projects.  The sooner the necessary skills base, practices, and tools are in place to repeatably complete successful OO projects, the sooner an organization can enter phase two.  Therefore one important mission of the North American Object Foundry is to institute the practices and provide the tools that most quickly and effectively address these phase one issues.

1.2  A First Step

A reuse culture cannot be based upon the reuse of code alone -- reuse requires an understanding of the fine grain design decisions upon which the code is based.  This paper describes a process for capturing an important component of that design information.  In brief, we propose that developers add a formatted comment, called a signature comment, into the text of each method.  The signature characterizes the object that is returned from the method and the objects intended as arguments, if any.  These object type descriptors, called qualifiers, provide information to the reader of the method that otherwise must be deduced by reading the code and/or browsing the clients of the code.  This added information improves communication within a project team and helps those who later seek to harvest and reuse the documented components.  It thereby addresses all three of the key phase one challenges. We recommend that this process be taught to beginning Smalltalk programmers and be adopted as standard practice in all Smalltalk projects.

2.  A Key Problem: Missing Design

Object-oriented design involves just a handful of basic elements.  One of the most important for understanding, harvesting and reusing an OO system is collaboration -- the client-server relationship between objects.  OO designers document collaborations with a variety of techniques (examples include CRC cards, message flow diagrams, object interaction diagrams, and scenario transcripts).  These approaches differ in the level of granularity at which they describe collaboration.  Some techniques, CRC for instance, deal only with class-to-class or subsystem-to-subsystem collaborations.   Others, such as object interaction diagrams, document collaborations at the method level.

Once the system is implemented, Smalltalk code becomes the fine grain embodiment of design; each message sent within a method is a method level collaboration.  To the extent that the fine grain design is complete, there is a one-to-one relationship between messages described in code and fine grain method level collaborations described in the design.  However, the fine grain design is seldom either complete or accurate.  In many projects the senior programmers design on a white-board, produce the program, then produce a retrospective design to satisfy management.  Even when relatively complete designs are available, the design seldom matches the actual implementation.  Changes made to the code are often not accompanied by appropriate changes to the design because the design resides elsewhere and programmers seldom take the trouble to find it and edit it.  Programmers and managers familiar with these realities consider the only trustworthy description of the design to be the code itself.

Code may be trustworthy but it is not a very readable description of design.  Code describes the interactions that will take place between the objects involved in its execution but it does not explicitly specify the identity of these objects or their placement in the inheritance hierarchy.  In terms of theatrical scenarios, the code precisely defines the dialog of the scene but says little about the actors.  Yet the designer's intent in casting the actors in method-level collaborations forms a substantial portion of the knowledge needed to understand, reuse, or modify an object-oriented system.

The problem of missing design is further compounded when inexperienced developers are part of the development team or are expected to maintain the resulting system.  Commercial Smalltalk systems provide unusually rich tools to help developers deduce collaborations from code.  There are tools for browsing senders and implementors, stepping through code in a debugger, and inspecting arguments at runtime.  One of the skills common to experienced Smalltalk developers is the ability to use those tools (together with accumulated knowledge of Smalltalk base classes) to quickly and effectively understand existing collaborations.  Experienced developers also tend to follow good coding conventions and style which helps readers of their code to understand the collaborations.  Inexperienced developers, on the other hand, are unfamiliar with the existing base classes, unaware of many telltale conventions, and not skilled with the investigative tools.  To compound this problem, inexperienced developers usually fail to adhere to good design and coding conventions and good style.  The components they develop are more difficult for both novice and experienced developers to fathom.  So the larger the project and the less experienced the developers, the more vital it is to capture and maintain the fine-grain design decisions that are otherwise only embodied in the code.

2.1  Know the Cast by their Signatures

Rather than asking downstream users of Smalltalk code to deduce this fine grain collaboration information each time they browse a method, developers should include that information with their code.  In other words, for each little scene (i.e., each method), the programmer should identify the actors.

This paper proposes that developers add a formatted comment, called a signature comment (or simply a signature), to each method.  The signature characterizes the object intended by the design to be returned from the method and the objects intended as arguments, if any.

Signatures document method-level collaborations in a standard and portable way that will enhance the effectiveness of project development staff.

Improved method-level documentation will pay additional dividends when code is harvested and subsequently reused. Since the proposed signature process requires no special tools it can be quickly adopted in projects and in training programs.  And since the benefits of using signatures begin accumulating immediately, we recommend that it be standard practice to include a signature in every method.

3.  Method Signatures

The term signature refers to design information about a method. What is proposed in this paper is the simplest sort of signature -- one that defines in a formal way the kind of object intended to be returned from the method and, if the method has arguments, the kind of objects intended to occupy the arguments.   The "kind of object" is specified by a qualifier (described in the next section).  Some discussion about extensions to signature information appears at the end of this paper.

A signature is an organized collection of qualifiers that is entered into each method as a formatted comment.  It should be inserted between the method’s message pattern and the normal method comment (see examples below).

A signature for a unary method (i.e., one without arguments) simply documents the qualification of the object returned from the method.  Its format is:

    "<^ qualifier>"

For methods with arguments, the qualifier for each argument is prefaced by the name of the argument with an appended colon.  The argument qualifiers are separated by commas. The following examples illustrate signatures for methods with and without arguments.  These signatures also illustrate some common kinds of qualifiers:  those that place an object in an inheritance hierarchy, those that specify certain special objects (e.g., true) and those that include a list of alternatives.

3.1  Example -- Appraisal >>asText

asText
    "<^hierarchyOf String>"
    "Answer myself rendered as text."
    | stream |
    stream := WriteStream on: String new.
    self presentAsTextOn: stream.
    ^stream contents
The signature for this unary method asserts that the method returns some kind of String.  Experienced Smalltalk programmers who are familiar with the idiom of Streams would have little trouble deducing that this method returns a string.  Novices who are not familiar with Stream protocol would not necessarily reach that conclusion quickly even though the method has only three lines of code.  In any case, it is quicker to read the signature than it is to deduce the return qualifier from the code.

3.2  Example -- Magnitude >>between:and:

between: min and: max
 "<min: hierarchyOf Magnitude, max: hierarchyOf Magnitude,
  ^(true | false)>"
 ^(min <= self) and: [self <= max]
The signature for this method asserts that min and max are expected to be kinds of magnitudes and the method returns true or false.  Note here that the readability of signatures is as important as that of code.  Long signatures should be folded onto multiple lines.

4.  Qualifiers

A Qualifier characterizes the objects that are qualified to occupy a variable given the role the variable plays in the design of the method.  As such, the system of qualifiers proposed here is an OO type system from the clients’ viewpoint [Den91].  We use the term qualifier rather than type to avoid some of the confusion and debate about just what is an OO type.

Several OO type systems have been published (some of which will be discussed in a later section).  They differ as much in the characterization of the problem they are attempting to solve as they do in the proposed solution. Among the different goals these systems address are:

The distinctions between these goals are not necessarily sharp and most of the published OO type systems address more than one.  In general though, the above list is in order of increasing focus on formal methods.  Increasing formality, rigor, and abstraction tends to be accompanied by decreasing usefulness for human-to-human communication. The qualifiers proposed here address the human side of the human/machine interaction.  Their primary goal is to document design intent in real-world Smalltalk applications.

The  most commonly encountered OO type system is the one in C++.  C++ types are relatively simple because messages may be sent only to objects that inherit from a common class, collections may hold only objects that inherit from a common class, and classes don't exist at runtime.  Smalltalk is fully object-oriented and fully polymorphic.  In Smalltalk any object that understands the necessary messages may take part in a collaboration without regard for inheritance, Smalltalk collections can accommodate elements of arbitrary (and varying) types, and Smalltalk collaborators may be classes themselves.  All of these issues are handled by the proposed qualifier system.

The syntax of qualifiers and signatures is intended to balance the competing issues of human readability, verbosity, expressiveness, and machine parsability.  Most frequently the intended collaborators and returned objects are accurately described as domain objects that inherit from a given class.  Also common are the familiar pseudo variables: true, false, nil, or self.  These simple cases can be qualified simply and easily.  When the idiosyncrasies and complexities found in actual Smalltalk usage exceed the reach of these simple qualifiers, composite qualifiers -- qualifiers built from groups of other qualifiers -- can be used.

To reduce the burden of entering the qualifiers, the most frequently used qualifiers have alternate short forms (indicated below by the underlined bold characters).   However qualifiers should generally be spelled out unless the resulting signature itself becomes unwieldy.

4.1  Class Qualifiers

The most common case is one in which the designer specifies acceptable objects in terms of their membership in a class or in a hierarchy of classes.   The name of the class (exactly as it would appear in code) follows an indicator of how the qualified object relates to that class.

4.2  Special Qualifiers

Some common cases, determined both by the definition of Smalltalk and by conventional usage, deserve special qualifiers to reflect that usage.  

Code                               Appropriate Qualifier
^self                              ^self
^self new  (in a class method)     ^hierarchyOf self
^self class                        ^myClass
^self class new                    ^hierarchyOf myClass
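
The rows of this table can be read against concrete methods.  The following is a minimal sketch, not taken from any real class library: the Appraisal methods appraiser:, forAppraiser: and emptyCopy, the instance variable appraiser, and the class Person are hypothetical names chosen for illustration.  Each method would be entered separately in a browser; the quoted label lines only identify the class and side on which each method is assumed to live.

```smalltalk
"Appraisal>>appraiser:  (instance side, hypothetical)"
appraiser: aPerson
    "<aPerson: hierarchyOf Person, ^self>"
    "Setter.  No explicit return, so the receiver is answered."
    appraiser := aPerson

"Appraisal class>>forAppraiser:  (class side, hypothetical)"
forAppraiser: aPerson
    "<aPerson: hierarchyOf Person, ^hierarchyOf self>"
    "Here self is the class, so subclasses answer their own instances."
    ^self new appraiser: aPerson; yourself

"Appraisal>>emptyCopy  (instance side, hypothetical)"
emptyCopy
    "<^hierarchyOf myClass>"
    "Answer a fresh instance of whatever class the receiver belongs to."
    ^self class new
```

In the class-side method the cascade ends with yourself so that the new instance, rather than the result of the setter, is answered.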
 
 

4.3  Qualifier Aspects

In some cases an object cannot be adequately characterized without also characterizing other objects that it "contains" or "controls."  To qualify a collection, for instance, one usually must not only specify the type of collection, but also the type(s) of objects contained in the collection.  In such cases we allow a comma separated list of aspect qualifiers, delimited by curly brackets, to be appended to the qualifier. The predefined aspects in common usage are given in the following table.  The default aspect is to be assumed in qualifiers in which the aspect is not explicitly given.
 
Qualified Object           Predefined Aspects    Default Aspect Qualifier
Collection                 of:                   any
Dictionary, Association    key:, value:          any
Point                      x:, y:                hierarchyOf Number
Stream                     on:                   hierarchyOf String

The default aspect qualifiers for Points and Streams cover most normal usage.  Only rarely do domain designs use points to hold pairs of objects other than numbers, and when they do, Associations would usually be a better pairing device.  Streams on collections other than Strings can be very useful, but that usage is relatively rare.  So aspects are seldom needed for these classes.  Collections, Dictionaries and Associations, on the other hand, typically are not used to hold or associate arbitrary objects.  The default in these cases is not chosen because it is the typical case but because it is all that is reasonable to assume if the aspect qualifier is missing.  Good practice dictates the use of aspects on all Collections, Dictionaries and Associations even if the default happens to be correct.  Points and Streams should specify their aspects only if the default is not correct.

Syntactically, aspects are arbitrary annotations to qualifiers.  The user can make up new aspects if needed.
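
As an illustration of aspect qualifiers in use, the sketch below applies the of:, key: and value: aspects to two accessor methods.  The Catalog class, its instance variables products and prices, and the Product class are assumptions made for this example only.

```smalltalk
"Catalog>>products  (hypothetical getter)"
products
    "<^instanceOf OrderedCollection {of: hierarchyOf Product}>"
    "Answer the products I list."
    ^products

"Catalog>>prices:  (hypothetical setter)"
prices: aDictionary
    "<aDictionary: hierarchyOf Dictionary
        {key: hierarchyOf Symbol, value: hierarchyOf Number},
      ^self>"
    "Remember the price of each product, keyed by product code."
    prices := aDictionary
```

Both signatures spell out their aspects even though a reader might guess them, in keeping with the advice to qualify the contents of every Collection and Dictionary.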

4.4  Behavioral Qualifiers

4.4.1  Block Qualifier

Blocks may be received as arguments and less frequently returned from methods.  When used in that way, they are essentially unnamed methods.  To describe them adequately requires the same information about the block as is provided in a method signature.  So block qualifiers resemble method signatures.

Block qualifiers differ from method signatures in that they are enclosed in square brackets (as are the blocks they describe) and their arguments, if any, are named by position rather than argument name.  The format is:

    [blockArg1: qualifier, blockArg2: qualifier, ..., ^ qualifier]

For instance, the block qualifier that describes the argument to the Collection>>#select: method would be

    [blockArg1: any, ^(true | false)]

which indicates that the block accepts any object as its argument and returns a Boolean.

Since a block may return one of its block arguments, the return qualifier inside a block may be one of its arguments, i.e., b1 or blockArg1.

Note that the block return qualifier qualifies the result of the block (i.e., the object returned when the block is evaluated); it does not describe what is returned from the method containing the block, even if the block contains an explicit return (^).  If the evaluation of a block  simply causes the method to return some value, then the block return qualifier should be none.

EXAMPLE --  the block [^ 'some string'] would be qualified as [^none].

Note, however, that method returns from within blocks passed as arguments involve subtle semantics.  The appearance of none in a block qualifier ought to warrant a second look by the designer or programmer.
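
A method that takes more than one block shows how block qualifiers combine within a signature.  The sketch below qualifies the familiar detect:ifNone: protocol; the signature is our own illustration rather than one published with any class library, and the zero-argument form [^any] is written by analogy with the [^none] example above.

```smalltalk
"Collection>>detect:ifNone:  (signature is illustrative)"
detect: aBlock ifNone: exceptionBlock
    "<aBlock: [blockArg1: any, ^(true | false)],
      exceptionBlock: [^any],
      ^any>"
    "Answer the first element for which aBlock answers true.
     If there is none, answer the value of exceptionBlock."
    self do: [ :element |
        (aBlock value: element) ifTrue: [^element]].
    ^exceptionBlock value
```

The inner block containing ^element is not an argument and so needs no qualifier.  If a client passes an exceptionBlock such as [^nil], that client's block would be qualified as [^none], which is the kind of case that deserves a second look.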

4.4.2  Signature Qualifier

Signature qualifiers describe objects in terms of the methods they support without restricting their class.  A signature qualifier states that the object is qualified if it implements a method with the given signature.  We use a format different from that of the method signature itself because the signature qualifier must specify the selector of the message and because the names of the arguments to the method are irrelevant.  So a signature qualifier looks much like the message to be sent with the arguments, if any, replaced by their qualifier.  There are three types of signature qualifier corresponding to the three types of message -- unary, binary, and keyword.  Keyword messages may have multiple keywords.  In that case, the keyword-qualifier pairs are separated from one another by commas.

unary -- < selector, ^ qualifier >
            EXAMPLE -- <asString, ^hierarchyOf String>

binary -- < binarySelector qualifier, ^ qualifier >
            EXAMPLE -- < <= hierarchyOf Number, ^(true | false)>

keyword -- < firstKeyword: qualifier, secondKeyword: qualifier, ..., ^ qualifier >
           EXAMPLE -- <copyFrom:  hOf Integer, to: hOf Integer, ^hierarchyOf String>

A candidate clearly qualifies if the method it implements or inherits has a signature that exactly matches the required signature.   Otherwise a candidate object qualifies if its arguments and return qualifiers satisfy the following relationships to those of the signature qualifier:

Signature qualifiers are needed or useful only for highly polymorphic cases, e.g., frameworks, where classes are expected to be added about which all that is known is that they will implement the signature.   Because signature qualifiers provide very little information about the kind of object expected, they should be used only in the rare cases where a high degree of polymorphism and only a high degree of polymorphism is explicitly intended.
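
To show where a signature qualifier sits inside a method signature, here is a minimal sketch.  The ReportWriter class, its selector appendLabelFor: and its instance variable stream (assumed to hold a WriteStream on a String) are hypothetical.

```smalltalk
"ReportWriter>>appendLabelFor:  (hypothetical)"
appendLabelFor: anObject
    "<anObject: <asString, ^hierarchyOf String>, ^self>"
    "Append the printable form of anObject to my output stream.
     Any object qualifies, whatever its class, provided that it
     answers a String when sent asString."
    stream nextPutAll: anObject asString
```

The argument is deliberately left unrestricted by class, which is the highly polymorphic situation for which signature qualifiers are intended.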

4.5  Composition of Qualifiers

4.5.1  Alternative Qualifier

Alternative (also known as union) qualifiers express the fact that an object meets at least one of a set of qualifications.  The set of alternatives is enclosed in parentheses and separated by vertical bars to indicate a logical or:

    (qualifier | qualifier | ...)

This construct is useful when the alternative classes are in different portions of the hierarchy yet still implement the desired protocol.  For example, suppose we are interested in searching through a collection of products in a mail-order company model.  The method that does the search may work equally well for searching a catalog of products or a warehouse of products even though Catalog and Warehouse classes may have no inheritance relationship to each other.  The alternative qualifier would describe that fact thus:

    (hierarchyOf Catalog | hierarchyOf Warehouse)

Another common usage is (true | false) instead of hierarchyOf Boolean for easier readability.

It is a hallowed tradition in Smalltalk to receive or return nil instead of the otherwise expected object as a flag for exceptional circumstances.  For instance, a method that normally returns a price list may return nil as an indicator that there is no price list.  In that case nil is listed as one alternative in the return qualifier, in the form ^(qualifier | nil).

Note:  modern exception handling mechanisms often serve better for such purposes.
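
The two uses of alternative qualifiers described in this section might look as follows in complete signatures.  This is a sketch under stated assumptions: the classes Catalog, Warehouse, Product, PriceList and Clerk, the instance variable priceLists, and the selector productNamed:ifAbsent: are all hypothetical.

```smalltalk
"Catalog>>priceListFor:  (hypothetical)"
priceListFor: aSeason
    "<aSeason: hierarchyOf Symbol,
      ^(hierarchyOf PriceList | nil)>"
    "Answer the price list for aSeason, or nil if there is none."
    ^priceLists at: aSeason ifAbsent: [nil]

"Clerk>>find:in:  (hypothetical)"
find: aProductName in: aSource
    "<aProductName: hierarchyOf String,
      aSource: (hierarchyOf Catalog | hierarchyOf Warehouse),
      ^(hierarchyOf Product | nil)>"
    "Answer the product named aProductName, or nil if aSource has none."
    ^aSource productNamed: aProductName ifAbsent: [nil]
```

In both methods nil appears in the return qualifier because returning nil is part of the intended design, not an accident of implementation.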

4.5.2  Conjunction Qualifier

At times an object must meet all of a list of qualifications.  That calls for a conjunction (or intersection) qualifier.  The parenthesized list in a conjunction qualifier is separated by ampersands to indicate a logical and:

    (qualifier & qualifier & ...)

One situation that may call for a conjunction qualifier is when a class hierarchy qualifier needs an added behavioral requirement, e.g.,

    (hierarchyOf Collection & <addAll: hierarchyOf Collection, ^arg1>)

which specifies only those collections that understand addAll:.  The need to use this kind of qualifier usually stems from anomalies in the structure of a hierarchy.  When possible these hierarchy anomalies should be corrected rather than papered over with complex conjunction qualifiers.

A conjunction qualifier may also be used to specify a set of signatures that the qualified object must understand.  This should be avoided except in the rare case in which it is the explicit design intent of the method that a set of signature qualifiers is all that properly characterizes candidate objects.

Note: the alternative indicator (|) and the conjunction indicator (&) may not be mixed within a single list.  In the very rare case where such a complex qualifier is needed, use explicit parenthesized groups.
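
The sketch below shows a conjunction qualifier in a full signature, followed by the explicit grouping needed when an alternative and a conjunction appear together.  The Order class, its instance variable items and both selectors are hypothetical, and the layout of the folded signature is only one reasonable choice.

```smalltalk
"Order>>addItemsTo:  (hypothetical)"
addItemsTo: aCollection
    "<aCollection: (hierarchyOf Collection &
                    <addAll: hierarchyOf Collection, ^arg1>),
      ^arg1>"
    "Add my items to aCollection.  Answer aCollection."
    aCollection addAll: items.
    ^aCollection

"Order>>addItemsToOptional:  (hypothetical)"
addItemsToOptional: aCollectionOrNil
    "<aCollectionOrNil:
        (nil | (hierarchyOf Collection &
                <addAll: hierarchyOf Collection, ^arg1>)),
      ^self>"
    "Add my items to the argument unless it is nil."
    aCollectionOrNil isNil
        ifFalse: [aCollectionOrNil addAll: items]
```

In the second signature the inner parentheses keep the & list separate from the | list, so no single list mixes the two indicators.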


5.  Signature Examples

First, a general note on style: As with code, signatures should be written with consistent style.  Long signatures should be laid out with the same care as long Smalltalk statements, i.e., placed on multiple lines with indenting to clarify the groupings.  The most common qualifiers have both a long and an abbreviated form - the long form should be preferred for simple qualifiers.  When the length of a signature or the complexity of a qualifier becomes a problem, abbreviations may aid readability.

5.1 Use of instanceOf versus hierarchyOf

Most classes inherit the behavior of their superclasses in a compatible manner, so the most common qualifier is hierarchyOf.  But methods may appear in an abstract superclass that are not intended to be used by all subclasses.  Or a subclass may override a method in a way that is semantically incompatible with its siblings and superclasses.  The Digitalk Collection hierarchy, for example, contains subclasses that do not (and in fact cannot) properly implement some of the methods they inherit.  Dictionaries are subclasses of Set that do not behave like sets.  So Dictionaries are clearly not acceptable replacements for an expected Set argument.  When the programmer wishes to pass a set as an argument, the qualifier should be instanceOf Set to indicate that Sets are acceptable, but subclasses of Set are not.  For instance:
Set>>intersect:
intersect: aSet
    "<aSet: instanceOf Set, ^instanceOf Set >"
    "Return intersection of self and aSet "
    ^self select: [ :element | aSet includes: element ]
Similar problems crop up in the Smalltalk-80 Collection hierarchy [Coo92].  Note: if you find it necessary to use instanceOf with classes of your own design, you should consider refactoring the classes to eliminate that necessity.

5.2 Use of argN

It is not uncommon for a method to return one of the arguments it received.  For instance:
Collection>>addAll:
addAll: aCollection
    "<aCollection: hierarchyOf Collection, ^arg1 >"
    "Answer aCollection.  Add each element of
    aCollection to the elements of the receiver."
    aCollection do: [ :element | self add: element].
    ^aCollection
Here aCollection is both the argument and the object returned by the method.  So the return qualifier could simply be the same as the argument qualifier and be technically correct.  However, if it is qualified as arg1 rather than hierarchyOf Collection, the return qualifier retains the identity information.  This is an important point to document and future static analysis tools will be able to make use of such information.
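
The same idea extends to methods with several arguments, where the return qualifier names whichever argument is handed back.  The following sketch assumes a hypothetical Inventory class that wraps a Dictionary held in an instance variable named quantities; it relies on the usual Smalltalk convention that at:put: answers the value stored.

```smalltalk
"Inventory>>at:put:  (hypothetical)"
at: aProductCode put: aQuantity
    "<aProductCode: hierarchyOf Symbol,
      aQuantity: hierarchyOf Integer,
      ^arg2>"
    "Record aQuantity on hand for aProductCode.  Answer aQuantity."
    ^quantities at: aProductCode put: aQuantity
```

Writing ^arg2 rather than ^hierarchyOf Integer tells the reader that the object answered is the very object passed as the second argument.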

5.3  Use of class qualifier

A common design pattern, known as the FactoryMethod pattern [GHJV94], relies on the class of an object being determined at runtime.  The method that decides the class must return that class.  For example a multimedia window may have some (usually private) service method to provide the class.  In that case, the return qualifier specifies a class rather than an instance.
Viewer>>viewerClassForDocument:
viewerClassForDocument: aDocument
    "<aDocument: hierarchyOf Document,
    ^hierarchyOf class DocumentViewer>"
    "Answer the proper class to view this kind of document."
    ^ViewerClasses at: aDocument ifAbsent: [GenericViewer]
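
A client of such a method typically instantiates the class it receives, and the two signatures then differ only in the word class.  The companion method below is a hypothetical sketch (the selector newViewerFor: is our own) that assumes viewer classes can be created with a plain new.

```smalltalk
"Viewer>>newViewerFor:  (hypothetical)"
newViewerFor: aDocument
    "<aDocument: hierarchyOf Document,
      ^hierarchyOf DocumentViewer>"
    "Answer a new viewer suited to this kind of document."
    ^(self viewerClassForDocument: aDocument) new
```

Here the return qualifier omits class because the method answers an instance, whereas viewerClassForDocument: answers the class itself.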

5.4  Use of a block qualifier, myClass, any, true, false

The collection hierarchy, in part because it supports such a wide range of use and in part because it contains some of the oldest classes in the Smalltalk image, requires some of the most complex kinds of qualifiers.  Abstract behavior intended to be inherited in a very general way can be difficult to qualify.  Consider:
Collection>>select:
select: aBlock
    "<aBlock: [blockArg1: any, ^(true | false)],
   ^instanceOf myClass>"
    "For each element in the receiver, evaluate
      aBlock with that element as the argument.
      Answer a new collection containing those elements
      of the receiver for which aBlock evaluates to true."
    | answer |
    answer := self species new.
    self do: [ :element |
        (aBlock value: element)
            ifTrue: [answer add: element]].
    ^answer
The argument, aBlock, must be a one argument block whose value is true or false.  Nothing can be said of the argument to the block; it can be any object.  The return qualifier presents a problem that is not completely representable with the present qualification system.  The method returns a new instance of a collection that is almost always of the same class as the receiver.  The return qualifier is therefore instanceOf myClass.  However, the collection isn't created by 'self class new'; it is created by 'self species new'.  For almost all classes, species returns the same thing as class.  The exceptions in the Digitalk VOS/2 image are:  Interval for which species returns Array, Symbol which becomes String, SymbolSet which becomes Set, and DoubleByteSymbol which becomes DoubleByteString.  For the current focus of documenting design intent, these exceptions are too rare to warrant an additional qualifier type (e.g., mySpecies).

5.5  Use of a qualifier modified by an aspect

MethodArtifact>>comments
comments
    "< ^instanceOf OrderedCollection {of: hierarchyOf String}>"
    " lazy getter so that comments are extracted only once."
    comments isNil
        ifTrue:
            [comments := OrderedCollection new.
             self extractComments].
    ^comments
Note that comments is an instance variable of the MethodArtifact class.  Signatures for getter and setter methods serve also to document the qualification of the associated instance variable.  When proper tool support is available, instance variables should also have qualifiers.  Until then, qualifiers in the setter and getter methods document the same information.  Note also that instanceOf OrderedCollection is used rather than hierarchyOf because the OrderedCollection class is referenced explicitly so we know its class exactly.
Lazy getters, such as the example above, guarantee to initialize the instance variable with the proper kind of object.  If the variable has not been initialized, it will be nil, which will trigger the initialization code contained in the method.  With other methods of initialization that are less reliable, there may be a question of what the proper return qualifier for a getter method should be.  From the perspective of type-safety, the fact that the method might return nil could tempt one to add nil as an alternative to the getter’s return qualifier.  However bugs, no matter how common, are not a part of intended design.  From the perspective of documenting design intent, nil should be included as an alternative only if it is the intent of the designer that nil be a legitimate return value from the getter.

5.6  Use of a signature qualifier in a conjunction

A signature qualifier can be used in conjunction with a hierarchy qualifier to restrict the qualified object to those classes that support the required behavior.  The need for this is very rare but one case might occur in methods that accept arguments qualified by hierarchyOf Collection .  Not all collection classes in the Digitalk collection hierarchy support collect: even though the method is implemented in Collection and therefore inherited by all subclasses.

This is not at all obvious from reading the code for collect: which is quite general.  The subtle problem is that collect: requires that the receiver (or more technically, the species of the receiver) implement add: in a way that any object can be accepted as the argument.   Yet Dictionaries and their subclasses require that the objects being added must be Associations (or more precisely that they understand key and value).

In most real-world applications, a method would not be qualified to accept any arbitrary collection and, therefore, this problem would not occur.  In the rare case where the problem might occur, the argument qualifier could specify that  the argument is some kind of collection but that it also must support add: with any argument.   That is expressed with a conjunction qualifier:

     (hierarchyOf Collection & <add: any, ^arg1>)
An object whose add: method would be qualified as <add: hOf Association, ^arg1> does not meet the requirement imposed by the qualifier <add: any, ^arg1>, because hOf Association is more specific, not less specific, than any (see the discussion in the signature qualifier section).

5.7  Use of none when the method does not return

Methods that simply signal an error are most often found in abstract classes.   For example:
FixedSizeCollection>>add:
add: anObject
    "<anObject: any, ^none >"
    "Add anObject to the receiver.  This method reports
     an error since fixed size collections cannot grow."
    ^self invalidMessage
More commonly, methods may signal an error under certain circumstances and return in others.  Such methods may use the none qualifier as one member of an alternative qualifier.
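
A method of this second kind might be qualified as in the sketch below.  The Stack class and its instance variable elements (assumed to hold an OrderedCollection) are hypothetical, and the error is reported with the standard error: message rather than any particular exception class.

```smalltalk
"Stack>>pop  (hypothetical)"
pop
    "<^(any | none)>"
    "Remove and answer my top element.
     Report an error if I am empty."
    elements isEmpty ifTrue: [^self error: 'stack is empty'].
    ^elements removeLast
```

The none alternative records that, by design, some invocations do not hand a usable object back to the sender.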

6.  Focus on Design Intent

The signature and qualification system presented here borrows from a number of prior systems for OO typing.   For instance, Suzuki, [Suz81], first proposed alternative (or union) qualifiers.  We borrow most from Borning & Ingalls, [BI82] who pioneered the notions behind what are here called special qualifiers (in particular, class, self, and argN), aspect qualifiers, signature qualifiers, and block qualifiers.
Type systems differ in detail and in the purpose for which the systems were developed.  Borning & Ingalls focus on documentation of design with some support for compile time type safety.  Johnson and Graver, [Joh86], [GJ90], focus on type safety and compiler code optimization.  Thomson  [Tho93] focuses on documenting the actual effect of the code (for portability and reverse engineering of existing code).  Wills [Wil91] focuses on substitutability of components and formal provability of correctness.  Cook [Coo92] focuses on documentation and factoring of existing Smalltalk Collection classes.  The Trellis/Owl system [SCB86] focuses on documentation and type safety.  America [Ame90] focuses on behavioral description and formal correctness.  It is important to recognize that the various goals of OO type systems have differing degrees of relevance and value at different stages in the life cycle of OO development, harvesting, and reuse.
 
Lifecycle Stage              Important Type System Issues
first stage development      documentation of design intent
refactoring                  documentation of actual use, type safety
harvesting and reuse         design intent, type safety, component substitutability
certification for reuse      type safety and proof of correctness

The system proposed here is intended to document design intent and to support some aspects of type safety.  We believe that the burden of developing reusable code cannot realistically be placed on designers or developers in the initial project.  A phase one project necessarily focuses on modeling the domain and getting a complete running application.  Generality and the needs of reuse are difficult to foresee in early projects.  The well-known rule of thumb is that components (classes, hierarchies, frameworks, and subsystems) must be refactored in light of other contexts two or three times before they become generally reusable, assuming, as Adams points out [Ada92], that they are reuseful in the first place.  If signature documentation required the designer or programmer to understand in advance how the components will be reused, the qualification system might languish unused.

Design intent, although forgotten all too quickly, is known when the method is written and, if captured then, becomes increasingly valuable at each later stage of the lifecycle.  Yet design intent may conflict with the actual code.  When that occurs, the conflict must be resolved.  The design, the code, or both may be incomplete or inaccurate.  Only the designer/programmer can make the call.  Automated type checking and type inference systems can provide input to the human referee when mismatches are detected but cannot replace the need for documenting design intent.

This is not to say that we are forever done once design intent has been documented, just that the process cannot properly begin without capturing design intent.  Each successive stage needs to refine preexisting design information and augment that information.  Once project components are harvested, the appropriate focus of a type system will change to one of describing highly polished reusable classes.  As components are harvested, mature, are refactored, and gain in generality, issues of correctness, reusability, and substitutability may become as important as the documentation of design intent.  It is not clear at this time whether that additional information can be captured in a format similar to the signatures and qualifiers presented here or will require a qualitatively different format.


7.  Next Steps

7.1  Adoption by Smalltalk Projects

We believe that signature documentation will provide immediate benefits in new and ongoing projects.  Project leaders, designers, programmers, and the customer will benefit from the capture of design information that otherwise would be lost.  Signatures in everyday use will also improve communication between designers and programmers, between teams of designers and programmers, and between individuals on a team.  Smalltalk programmers spend far more time reading code than writing code.  Of the time spent reading code, a considerable portion is spent deducing the requirements of the arguments and deciding what kind of object is returned.  Inexperienced Smalltalk programmers spend proportionally more time trying to understand the code.  Where accurate signatures are provided, programmers will find immediate productivity improvements.

Adoption of signature documentation does, however, require some changes to everyday work practices, and these changes may be seen, especially by experienced Smalltalkers, as an unwarranted burden.  The new burdens placed on programmers include the thought required to provide good qualifiers, the additional typing to enter them, and the need to change the signature if the method changes in ways that invalidate the signature information.  The need to enter a signature for every method is indeed a burden, though usually a small one.  Some simple tool support should be available in 1995 to minimize that burden (see later section).  The thought required to decide the proper qualifiers may be perceived as a burden, but experience has shown that it is a benefit in disguise; typically the extra thought improves the design.  Maintaining accuracy when methods are edited is the key issue.  Eventually tool support will be available to signal mismatches between the signatures and the code.  Until that time, programmers will have to be motivated to update their signatures by management oversight, peer pressure (both ad hoc and during code reviews), and professional pride.

Newly trained Smalltalk programmers can and should learn to use signatures (see next section).  Experienced Smalltalkers will need to learn on the job, perhaps in the midst of a project in which they already seem swamped.  We believe that a transition to the use of signatures can be made in mid-project with little impact and that the improved documentation of design will improve productivity enough to repay the cost of entering the signatures.  Nonetheless, the engineers, together with their project leaders, will have to weigh the trade-offs in light of the details of their project.

7.2  Adoption in Training Programs

Training in signature documentation better prepares students to use signatures when they take their place in Smalltalk projects.  Signatures can also provide immediate pedagogical benefits in the classroom.  Signatures can clarify an issue that often confuses those new to Smalltalk: just what objects are returned from methods.  Also, the ability to clearly specify and reason about fine-grained design will facilitate the teaching, discussion, and review of fine-grained design issues.

7.3  Evaluation and Evolution of the Signature Process

The effectiveness of this system in real use needs to be assessed in the first few months of deployment.  The completeness, expressiveness, usefulness, and correctness of the qualification system can only be judged after real-world use in the trenches.  We expect to modify, enhance and provide better guidance about standard usage patterns after signatures have seen some use in projects.

We are already aware of a number of areas that can and will receive attention in future iterations of the signature system.  Additional information may be included in signatures, signatures may be used in additional ways, and additional tools will be needed.

The most important longer-term issue is support for assuring the accuracy of the signatures.  The qualifier syntax is designed to be parsed by future tools that can provide a static analysis of the match between the collaborators described by the qualifiers and the collaborations implicit in the method's code.  Tools may also make use of signature information to extract larger-scale design, for example to roll up method collaborations into class or even subsystem collaborations, and to present these collaborations in interaction diagrams and collaboration nets.

8. References


9.  Appendix: Quick Reference to Signatures

9.1  Method Signatures

For unary methods (which do not have arguments) only the return qualifier is specified:
 < ^ qualifier>
For methods with one or more arguments, the signature adds a qualifier for each argument keyed by the argument’s name:
 < arg1Name: qualifier, arg2Name: qualifier, ..., ^ qualifier>

9.2  Qualifiers

9.2.1  Class Qualifiers

9.2.2  Special Qualifiers

 
Code                               Appropriate Qualifier
^self                              ^self
^self new  (in a class method)     ^hierarchyOf self
^self class                        ^myClass
^self class new                    ^hierarchyOf myClass
 

9.2.3  Qualifier Aspects

qualifier {aspectName1: qualifier, aspectName2: qualifier, ... }
The predefined aspects are:
 
Qualified Object            Predefined Aspects     Default Aspect Qualifier
Collection                  of:                    any
Dictionary, Association     key:, value:           any
Point                       x:, y:                 hierarchyOf Number
Stream                      on:                    hierarchyOf String
 

9.2.4  Alternative Qualifier

( qualifier | qualifier | ... ) -- one or more of the qualifiers must apply

9.2.5  Conjunction Qualifier

( qualifier & qualifier & ... ) -- all of the qualifiers must apply

9.2.6  Block Qualifier

 [ :blockArg1 qualifier, :blockArg2 qualifier, ..., ^ qualifier ]
Since the return result of a block may be one of its block arguments, the return qualifier inside a block may refer to its arguments, e.g., ^ b1 or ^ blockArg1.  Note: the return qualifies the result of the block, not a method return.
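
As a hedged illustration of a block qualifier inside a method signature, consider a detect:-style method on a hypothetical collection class.  The sketch assumes the receiver responds to do:, assumes that hierarchyOf Boolean is how a true/false result would be qualified, and uses an invented error text.

```smalltalk
detect: aBlock
    "<aBlock: [ :b1 any, ^ hierarchyOf Boolean ], ^ (any | none)>"
        "Hypothetical example.  Answer the first element for which
         aBlock answers true.  Signal an error, and so do not
         return, if no element satisfies aBlock."
    self do: [:each | (aBlock value: each) ifTrue: [^each]].
    ^self error: 'no element satisfies the block'
```

Here the return qualifier inside the brackets describes what the block answers, while the final return qualifier, outside the brackets, describes what the method itself returns.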

9.2.7  Signature Qualifier

unary -- < selector, ^ qualifier >
binary -- < selector qualifier, ^ qualifier >
keyword -- < firstKeyword: qualifier, secondKeyword: qualifier, ..., ^ qualifier >