Software and Documentation

I used to pass by a large computer system with the feeling that it represented the summed-up knowledge of human beings. It reassured me to think of all those programs as a kind of library in which our understanding of the world was recorded in intricate and exquisite detail. I managed to hold onto this comforting belief even in the face of 20 years in the programming business, where I learned from the beginning what a hard time we programmers have in maintaining our own code, let alone understanding programs written and modified over years by untold numbers of other programmers. Programmers come and go; the core group that once understood the issues has written its code and moved on; new programmers have come, left their bit of understanding in the code and moved on in turn. Eventually, no one individual or group knows the full range of the problem behind the program, the solutions we chose, the ones we rejected and why.

Over time, the only representation of the original knowledge becomes the code itself, which by now is something we can run but not exactly understand. It has become a process, something we can operate but no longer rethink deeply. Even if you have the source code in front of you, there are limits to what a human reader can absorb from thousands of lines of text designed primarily to function, not to convey meaning. When knowledge passes into code, it changes state; like water turned to ice, it becomes a new thing, with new properties. We use it; but in a human sense we no longer know it.

Ellen Ullman
Salon Magazine, May 13, 1998
See also Salon's Interview with Ellen Ullman
Ellen Ullman is the author of the book, Close to the Machine: Technophilia and its Discontents, City Lights Books, 189 pp.


Like many Internet startups, @Home had been lax about documenting its software code; as employees moved on, it became impossible to make changes or create new applications without months of work and constant crashes.
The $7 Billion Delusion, by Frank Rose, WIRED Magazine, January 2002


A novice asked the Master: "Here is a programmer that never designs, documents or tests his programs. Yet all who know him consider him one of the best programmers in the world. Why is this?"

The Master replies: "That programmer has mastered the Tao. He has gone beyond the need for design; he does not become angry when the system crashes, but accepts the universe without concern. He has gone beyond the need for documentation; he no longer cares if anyone else sees his code. He has gone beyond the need for testing; each of his programs are perfect within themselves, serene and elegant, their purpose self-evident. Truly, he has entered the mystery of Tao."
The Tao of Programming by Geoffrey James (published on Kragen Sitaker's web site). Since many programmers believe that the software they write is self-documenting, it is worth noting that Geoffrey James's The Tao of Programming is satire.

Introduction

This web page is a meditation about how organizations can develop software that costs less to maintain. Two years ago, when I read Ellen Ullman's beautiful essay in Salon, I copied the paragraphs above onto this web page. I always thought that I would use her words as a starting point for a web page on software documentation. I was finally prompted to write this web page because I've been thinking a lot about documentation recently. I've been wrestling with an older piece of software that consists of around fifteen thousand lines of C++ and little in the way of comments or documentation. I've fixed several bugs in this code, but it has been very slow going. In some cases it has taken a couple of weeks to fix a simple problem.

The code I've been struggling with is certainly not the worst I've seen. It uses object oriented design and the people who wrote it are very bright. But there is obviously a problem since it takes so much work to fix what appears to be a simple flaw. As I've struggled with this software, I've thought more about what can be done in software projects to develop code that costs less to maintain.

Part of the problem is documentation. This code has few comments and little external documentation. When I started writing this essay I was going to emphasize that the key to more maintainable software was documentation. If the code had clear comments and some external documentation that described the higher level architecture, I might have been able to understand it sooner and fix bugs in less time. As I wrote this essay and thought about the issues, I realized that the solution is not so simple. Documentation is critical to maintainability. But what form should the documentation take? How can engineers be encouraged to write documentation, which they tend to be reluctant to do? Software structure is also important. Are there any guidelines that can be used for structuring a C++ program? Some people think that 600 line functions are fine. How can they be convinced that smaller functions that decompose program functionality are better?

I've seen two extreme views on how to develop better software. On one pole there are the software project management consultants. They never seem to doubt that they have the answers. On the other pole are the cynics who believe that there is no solution, life is hard and we just have to struggle through the unchanging morass. My views fall somewhere in the middle. I would like to think that twenty years of working on large software development projects has taught me something. But I don't believe that there are any final answers. Certainly what I've written here falls far short of a radical solution. There will never be a perfect software project that develops the perfect piece of software. But I believe that we can always do better.

Wandering without a map

Although it is not my favorite activity, over the years I have done my share of software maintenance, both as a consultant and as a staff engineer. Software maintenance involves fixing bugs and adding new features. In many high tech companies, there is frequent staff turnover (sometimes as much as 20% a year). The person who wrote the software may no longer be at the company. Frequently the software is poorly designed and constructed and in almost every case the software is undocumented. When software is poorly constructed and there is no documentation, even simple bugs can take days to fix. Changes to fix a problem may interact with other parts of the software in unexpected ways and introduce new bugs. New features take even longer to add and have a high probability of breaking the software.

Unless management enforces design and implementation documentation, software frequently has no description outside of the code itself. Software tends to be undocumented because most software engineers hate to write. They majored in computer science so they would not be required to write clear English (I should know: I took as few courses outside of science and math as possible). In some cases English is not the engineer's first language and they are, at best, awkward writers. Engineers are almost as reluctant to comment their code as they are to write external documentation.

Nothing I've written so far, nor anything else I will write here, will be new information to anyone who has worked on commercial software. The importance of clean, simple, software design and documentation describing the software has been discussed for at least thirty years. Most managers and software engineers know that the majority of the cost of a software system is not in development but in maintenance. This fact has been used by the open source software community to argue that software should be given away free. Organizations should only charge for maintenance and enhancement, the open source theorists claim.

Back in 1997 I was told that it was because the code was meant to be "self documenting", that is, it was meant to be plain enough that you didn't need comments to understand it and comments got in the way and made the code more difficult to read.
Kamelion, posting in a slashdot.org discussion of the book Cube Farm by Bill Blunden

The importance of documentation should be obvious. Yet software projects are still started without requirements documents and implemented without any design documentation. I still hear software engineers tell me that the source code is "self documenting".

What is to be done?

The development of large software systems is one of the most complex tasks humans undertake. Software projects frequently exceed their schedules by 50% or more. When the software is delivered it is full of bugs and does not have all of the necessary functionality.

One of the largest purchasers of custom software, the US Department of Defense (DoD), attempted to address the problems of software project failure and cost overruns by mandating extensive external documentation. DoD sponsored the development of a programming language, Ada, and mandated its use on all DoD projects. While these steps yielded some benefit, they did not solve the problems. Projects still failed to deliver usable software and overran budgets. The resulting software was still frequently difficult to maintain. No one uses Ada today unless they are forced to.

Books like The Witch Doctors: Making Sense of the Management Gurus, by John Micklethwait and Adrian Wooldridge describe how management consultants promote various management fads (management by objective, corporate re-engineering, emotional intelligence and countless more), selling their services to shallow corporate executives.

Computer software is similarly driven by fads. In the 1970s and early 1980s there was structured programming (no gotos). In the later part of the 1980s it was modular design (e.g., Ada and Modula-2). In the 1990s and the early part of this century the solution to our problems has been object oriented design and design patterns. Some of the newest fads are "Extreme Programming" and the use of the Unified Modeling Language (UML). Software management consultants sell each new fad as "the solution" which will cut a path through the swamp of software development. The managers who pay these consultants attempt to impose these fads on the engineers that report to them. Sadly nothing changes (except that the software management consultants find a new fad to promote when the current fad gets stale). The truth is that there are no simple solutions. There are no magic bullets.

This does not mean that the situation is hopeless. Although there are no simple solutions, there are some steps that can be followed that are more likely to lead to reliable software that can be maintained at less cost.

  1. Develop a requirements specification.

  2. Develop a design specification.

  3. Comment the code.

  4. Develop well structured software.

  5. Change the nature of the Quality Assurance Group.

  6. Software should be maintained by the person who developed it.

The fourth point, "Develop well structured software", is the most difficult to address. What is well structured software? Software that is clearly constructed and extensible (sometimes conflicting goals) is critical to maintainability. What rules and design principles should be used? Bookcases full of books have been written on this topic. There are a number of solid principles that can be used in writing software, but some of software design is esthetics. What is clear, well structured code to one person may not be to another. When we get to C++ or Java object design the problem becomes even more complex. Objects can provide abstraction, but they can also hide the detail we need to understand the code (see Thomas Niemann's essay Nuts to OOP, published in the August 1999 issue of Embedded Systems Programming). There are whole taxonomies of horrors that can be committed in C++. This issue is too large for this web page. Rather than write something shallow and meaningless, I am reluctantly leaving it out.

Requirements Specification

A requirements specification describes the functions the software should perform. The idea that you can build something only if you know what you are supposed to build should be obvious. Imagine walking into the office of a General Contractor and saying "I want you to construct a building for me" without providing the contractor with plans or other information. No one would ever do this, nor would any reputable contractor accept such a job. Yet software is frequently built without any specification. This may be acceptable for a research project, but it usually fails for product quality software.

Most people know that jumping into development without doing requirements and design documentation will not save time in the long run. Whatever time might be saved by starting development immediately is lost because the quality of the software is low and it is difficult to change. But people still start development without knowing where they are going.

During the two years or so when the Internet boom was strongest I talked to a number of start-up companies. Many of the people who ran these companies felt that they were operating on "Internet time". Product development and release schedules of only a few months, and the delivery of buggy prototype software (e.g., the early versions of Netscape and Internet Explorer), were viewed as critical for survival. Along with the sleep, sanity and personal relationships of the staff, one of the first things to be sacrificed was any kind of documentation. There was no time to develop documentation. The software would be developed as fast as possible and shoved out the door. One of the most famous examples of this attitude toward software development was the Netscape browser. The software became unmaintainable and was discarded by the Mozilla group (see Maoist software development, below).

The requirements specification should clearly state what purpose the software is supposed to fulfill. Usually the high level requirement can be defined in a few sentences: "This software will control baggage handling at the Denver International Airport. The software will manage baggage routing to and from the airplanes via a set of automated baggage handling systems." The rest of the requirements document elaborates on the functional details covered in the summary paragraph.

Sometimes a requirements specification may be very simple: "This compiler will implement ANSI C++" and reference a complex external document like a language or networking standard.

Design Specification

The design specification should describe the overall architecture of the software. For example, if it is a client/server application it should describe which functions are supported in the client and which functions are supported in the server. If it is a class library, the interface should be described. If it is financial software, the formulas used should be described. If the software interacts with other software, this interaction should be described at a functional level.

Lower level implementation choices may be described as well. For example, if it is a distributed application using CORBA, the reasons for choosing CORBA would be discussed, along with the CORBA architecture.

Comment the code

Comments in code explain the code to the reader and make it more likely that the code will be correct. By writing the code and then explaining the logic behind its operation in a comment, the programmer writes the algorithm twice and may discover errors in the design or implementation.

The concept that extensive and well maintained comments in the code could reduce the cost of software maintenance and reduce the number of new errors introduced when the code is changed should not be controversial. Yet for twenty years I have heard software engineers say "the code is self documenting". That is, all the reader has to do is read the code to understand the software. The unstated implication of those who claim that their code is self documenting is that anyone who does not understand it is either stupid or an incompetent software engineer. This neatly moves the responsibility for understanding from the author of the code to the person who has to maintain or extend it.

While it is true that any experienced programmer who is fluent in the programming language used to construct the software can understand what the software does, it can take a great deal of time to understand why the software is structured the way it is. Comments are not there to describe the obvious functions of the code, but to explain the software architecture and the interactions of the components.

As far as I can tell, source level documentation does not seem to be an important issue to the software development community in general. For example, the Mozilla group, which is developing the open source version of Netscape, has fairly extensive guidelines that discuss the code review and the bug fix process. There are also guidelines on writing portable C++ code. The Mozilla project also includes a lot of external documentation. Some of this documentation is excellent. But nowhere did I see anything about commenting the code. This is somewhat surprising, since the Mozilla source base is massive and new people will join the project over time. As the source base changes the external documentation will tend to become obsolete. Without accurate comments in the code (which can be maintained along with the code) new developers are more likely to make mistakes when adding new features or bug fixes. Perhaps there is some unstated hacker view that "real programmers" don't need comments.

In his paper More Than a Gigabuck: Estimating GNU/Linux's Size, June 20, 2001, David A. Wheeler estimates that the Red Hat Linux 7.1 release of GNU/Linux consists of over 30 million lines of source code. This includes development tools, like the GNU C/C++ compiler gcc, in addition to the Linux operating system itself. The Linux kernel and drivers consist of 2.4 million lines of code, of which over half is device drivers.

This is a staggering amount of code. It is a great deal of work to understand a software system consisting of a few tens of thousands of lines of code. Few people can fully understand a Linux subsystem consisting of 100K lines of code. Although there are books that document the architecture of the Linux source, the actual detail in such a massive body of software can only be documented in the code itself. As with most software, the Linux release is poorly documented and comments are few and far between. Uncommented code is a huge unaddressed problem for the open source community, where a massive source base is modified by hundreds of people.

The sad truth is that some software engineers comment their code and others do not. As a manager I've found it very difficult to get the people who do not comment their code to start doing so. I have extensively commented my own code and hoped that others in the group would be motivated by this example. I have even mentioned code documentation in yearly performance reviews. But my efforts failed. Clearly, commenting code is a significant amount of work.

An engineering manager who believes that they can simply issue an edict that software will, from now on, be documented is deluded. In the end no one can be forced to do anything. This is doubly true of engineers. Especially when there is a shortage of software engineers, we can't simply have those who don't document their code shot by the Cheka. Some companies have tried "documentation by edict", enforcing a certain number of lines of comment per line of executable code. In most cases this just forces the creation of meaningless comments. The objective of maintainable code cannot be reached by fiat.
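To make the weakness of a comment quota concrete, here is a minimal sketch in C++. The function comment_ratio and the two sample snippets are my own illustration, not any company's actual rule, and the metric is deliberately crude: it counts only whole-line "//" comments.

```cpp
#include <sstream>
#include <string>

// Hypothetical metric of the kind an edict might enforce: the number of
// whole-line "//" comments divided by the number of code lines.
// Blank lines are ignored; trailing comments count as code lines.
double comment_ratio(const std::string& source) {
    std::istringstream in(source);
    std::string line;
    int comments = 0;
    int code = 0;
    while (std::getline(in, line)) {
        const auto first = line.find_first_not_of(" \t");
        if (first == std::string::npos) continue;      // blank line
        if (line.compare(first, 2, "//") == 0) {
            ++comments;
        } else {
            ++code;
        }
    }
    return code == 0 ? 0.0 : static_cast<double>(comments) / code;
}

// Satisfies a "one comment per line of code" quota, yet tells the
// maintainer nothing the code does not already say.
const std::string padded =
    "// add 1 to x\n"
    "x = x + 1;\n"
    "// return x\n"
    "return x;\n";

// Has the only comment a maintainer would actually need, yet fails
// the same quota.
const std::string useful =
    "// Slot 0 is reserved for the header, so indexing starts at 1.\n"
    "x = x + 1;\n"
    "return x;\n";
```

Under these assumptions comment_ratio(padded) is higher than comment_ratio(useful), which is the opposite of the ranking a maintainer would give the two snippets.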

If documentation cannot be arrived at by command from on high, what can be done? First there needs to be general agreement in a software development group about what comments should cover. When I've talked to people about documenting their code I find that often there is no common understanding of what should be documented. Comments should describe the software architecture and the reasons why the code functions as it does. Comments should provide a higher level view of the program function. The low level view is provided by the code itself. Comments like

  // don't comment like this:

  x = x + 1;  // add 1 to the variable x

are useless, since it is obvious what this statement does. A comment should describe high level function. For example:


/*
 * change_to_imm_form
 *

   When an instruction has an immediate form, this function attempts
   to substitute an immediate for a register operand.  This
   can reduce register usage, since it can reduce the lifetime
   of a virtual register.  For example,

        add V2, V3, V5

   can be transformed into

        addi V2, V3, 42

   if virtual register V5 has previously been loaded with the short
   immediate value 42.

   This function also looks for symmetric instructions (e.g., add, and,
   or, etc...) where operand two and three can be exchanged in the
   case where operand two is a constant.

 */
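To show how a header comment of this kind sits on top of real code, here is a simplified sketch in C++ (C++17 is assumed for std::optional). The Instr structure, the known_consts map and the restriction to "add" are hypothetical simplifications for illustration; this is not the original compiler source.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical three-operand instruction: dest = src1 op src2.
struct Instr {
    std::string op;          // mnemonic, e.g. "add"
    int dest;                // operand one: destination virtual register
    int src1;                // operand two: first source virtual register
    int src2;                // operand three: second source virtual register
    std::optional<int> imm;  // set once the immediate form is in use
};

/*
 * change_to_imm_form (simplified sketch)
 *
 * Why this exists: replacing a register operand with an immediate
 * shortens the lifetime of the virtual register that held the constant,
 * which reduces register usage.
 *
 * known_consts maps a virtual register number to the short immediate
 * value it was previously loaded with.  Only "add" is handled in this
 * sketch.  Because add is symmetric, a constant found in operand two is
 * first exchanged with operand three, since only operand three has an
 * immediate form.
 *
 * Returns true if the instruction was rewritten to "addi".
 */
bool change_to_imm_form(Instr& in, const std::map<int, int>& known_consts) {
    if (in.op != "add" || in.imm) return false;

    // Exchange the sources when only operand two holds a known constant.
    if (known_consts.count(in.src1) != 0 && known_consts.count(in.src2) == 0) {
        std::swap(in.src1, in.src2);
    }

    const auto it = known_consts.find(in.src2);
    if (it == known_consts.end()) return false;

    in.op = "addi";
    in.imm = it->second;  // src2 is no longer read once imm is set
    return true;
}
```

The header comment carries the reason for the transformation and the one non-obvious design decision (the operand exchange). The inline comments are kept to the places where the code alone would leave a maintainer guessing.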

Once people have some idea about what comments are supposed to cover, a structure needs to be put in place to encourage them to comment their code. Code reviews can provide this motivation. By allowing the engineer's peers to review the code, there is peer pressure to add comments. The reviewers can also provide feedback on whether the comments provide enough detail to understand the code.

In the early 1980s code reviews were proposed as a way of improving software quality. The idea behind code reviews was that if more than one person reviewed software source, more errors could be found. In most cases code reviews fail to improve software quality. The software developer knows far more about the software architecture and design than the reviewers, who usually have not read the code before the meeting. Management can issue an edict that reviewers must "do their homework", but in the end schedule pressure wins out and no one looks at the code before the review. As a result, the code review turns into a "dog and pony show" where the software author swamps the reviewers with detail. If the software is written by an experienced developer, any errors will be in subtle areas of design, not in implementation, and the reviewers are not likely to catch them. There is also always the danger that the software review will degenerate into heated arguments about style.

A better use of code reviews is to peer review the maintainability of the code. The fact that the reviewers are not as expert as the author is an advantage in this case. If the reviewers can understand something about the software architecture from reading the code and comments then someone who must maintain the code in the future can do so as well. I found it interesting that the Mozilla group now requires code reviews:

To improve code quality, mozilla.org now requires all changes to be approved by a designated Mozilla code reviewer. This extra level of review applies to everyone, including Netscape engineers.

From www.mozilla.org

Unfortunately none of the code review guidelines I saw involve maintainability of the source code.

Change the nature of the Quality Assurance Group

In many companies the development group and the quality assurance (QA) group are separate and report to different managers. This creates an environment where developers tend to view testing and software quality as the responsibility of the QA group. Developers who do not feel a strong responsibility for product quality are unlikely to develop a quality product. To avoid a division into a group that is responsible for development and a group that is responsible for quality, the QA engineers should be part of the development group.

Managing software is a significant task. Central to this is a source control system like ClearCase or CVS. There should also be a system to regularly build the software as changes are checked in. Different versions should be kept on-line so that work can continue if a change breaks the software (e.g., an earlier build can be used). Each time a bug is fixed the developer that fixed the bug should write a regression test which should be added to the regression suite. Software should exist that automates running the regression suite and larger application level tests.
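As a rough illustration of the "one regression test per fixed bug" practice, here is a minimal sketch in C++. The average function, the bug numbers and the RegressionTest structure are all hypothetical; a real project would more likely use an existing test framework and run the suite from its automated build.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical function under maintenance.  The made-up bug report said
// that the original version returned NaN for an empty input, because it
// divided 0.0 by a count of zero.
double average(const std::vector<double>& values) {
    if (values.empty()) return 0.0;  // the fix for the reported bug
    double sum = 0.0;
    for (double v : values) sum += v;
    return sum / static_cast<double>(values.size());
}

// One entry per fixed bug.  The name ties the test back to the bug
// report so a later maintainer can find out why the test exists.
struct RegressionTest {
    std::string name;
    std::function<bool()> run;  // returns true while the bug stays fixed
};

// The suite only ever grows: each bug fix adds a test here.
std::vector<RegressionTest> regression_suite() {
    return {
        {"bug-0001: average of empty input",
         [] { return average({}) == 0.0; }},
        {"bug-0002: average of a single value",
         [] { return average({4.0}) == 4.0; }},
    };
}

// Runs every test and returns the names of the ones that failed, so an
// automated build can report them or refuse the check-in.
std::vector<std::string> run_suite(const std::vector<RegressionTest>& suite) {
    std::vector<std::string> failures;
    for (const auto& test : suite) {
        if (!test.run()) failures.push_back(test.name);
    }
    return failures;
}
```

A build script would call run_suite(regression_suite()) after each build and treat a non-empty result as a broken build.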

Configuration management, test development and test software support are significant tasks. QA engineers should be the part of the development group that concentrates on these issues during product development. Although the QA engineers should work closely with the developers, developers still have responsibility for developing regression tests and running the regression suite.

Software should be maintained by the person who developed it.

When software is released for use, the bugs found in the software will follow a "bug curve". When the software is first used there will be a sharp increase in the number of bugs found per day. At some point the number of bugs found per day will plateau. If the software is well designed the number of bugs found per day will begin to decline. At some point in the declining curve management will decide to release the software to more end users. When software is first released, bug fixing may take a significant amount of time for the development group. Well designed software will mature over time and less and less time will be needed for maintenance.

If software is poorly written, it will continue to consume significant amounts of time to maintain. Some organizations pass the software off to a maintenance group to free the developers to develop new software. This is the worst thing they can do. The software developers who developed the buggy software that takes so much time to maintain should not be allowed to escape their creation. They will only be freed to develop more bad software.

By insisting that developers maintain their own code, a certain selection will take place. Those who develop reliable software will have the time to write new software. Developers who build unreliable software will not have the time to do additional damage, since they will spend all their time fixing bugs in their existing software. They may also be encouraged to rewrite their code to make it more reliable.

Maoist Software Development: burn it down!

Over time software is like a ship that has sunk in tropical water. A diver looking at the ship when it first sinks sees the ship in all its detail. The superstructure of the ship is clean and closely resembles the ship before it sank. If the diver returns after a year there will be coral and other marine life growing on the hull of the ship. Fish and eels will have moved into this new habitat, but the ship is still recognizable. If the diver returns after five years, they will see only the outlines of the original vessel. The superstructure of the ship will be entirely covered with coral.

Software changes in the same way over time. After a few years of changes, the original author can only recognize the general outline of the software they developed. As new functionality that was not envisioned by the original author is added, even well designed and documented code will succumb to incrustation. The software will become more difficult to maintain as its original structure becomes obscured by later changes. There will come a time when it must be discarded and rewritten. Management hates to do this because the new software must be debugged and may initially have more bugs than the existing code. But if the software is to be maintained and improved, there is no choice.

Who was Mao and Why is it "Maoist Software Development"?

But if you go carrying pictures of Chairman Mao
You ain't going to make it with anyone anyhow

From Revolution, by the Beatles, written by Lennon/McCartney

Mao was in power when I was growing up, but it occurs to me that some readers of this web page may not really know who Mao was or why I have referred to software rewrites as "Maoist Software Development".

Mao was the leader of the Chinese communist revolution and chairman of the Chinese Communist Party. Mao and the Chinese Communist Party took power in China in 1949 after a long struggle for power which went back to the 1920s. Mao's strength of character and determination may have been admirable before he ruled China, but as the leader of the oligarchy that ruled China, he was a monster. In the course of his rule, Mao was responsible for the needless deaths of millions of Chinese. This places Mao in the company of Hitler and Stalin as a mass murderer on a grand scale. So my use of Mao's name here is ironic, not an indication that I admire him.

Mao launched the "great cultural revolution" which was supposed to destroy the old order, revitalize China and bring about the Chinese socialist state (which is supposed to precede the communist state). The theme of the cultural revolution was "burn down the old order and build the new one". As is the case with some software rewrites, the result of the great cultural revolution was a disaster, which scarred an entire generation of Chinese.

Resources and References

Code Understanding Tools

The fact that so much software is undocumented means that a new person maintaining it has a huge learning curve. A variety of software tools claim to help in "code understanding". My experience with such tools is that they can help the user understand how components are interconnected and illuminate the interdependencies. It should be obvious that they cannot shed light on the actual logic behind the design. These tools are a last resort for software that should have been well documented and clearly designed, but wasn't.

This list of "software understanding" tools is certainly incomplete. This list concentrates on tools for Solaris (i.e., Sun Microsystems' version of UNIX). Microsoft Visual C++ (or Visual .NET or whatever they currently call it) includes a source code browser. Eclipse provides an excellent environment for browsing Java code. I've listed only the software that I've used or encountered in my wanderings. A comprehensive list would be much larger. Inclusion in this list does not necessarily mean that I endorse the software.

Web References

Suggested Reading

Microsoft develops more software than any other single company in the world (with the possible exception of IBM). The company is famous for releasing its software late, sometimes by years. Perhaps because of this experience, some of the best books on software development and project management have been written by Microsoft engineers or published by Microsoft Press.

  1. Debugging the Development Process by Steve Maguire, Microsoft Press, 1994

    This is an excellent book written by a project lead and software engineer who has experience with real industrial development projects.

  2. Dynamics of Software Development by Jim McCarthy, Microsoft Press, 1995.

    Another excellent work written by a software engineer with wide industrial experience.

  3. Software Project Survival Guide by Steve McConnell, Microsoft Press, 1998

    This book has some good advice, but it also strays into the realm of software management consulting with terms like "stakeholder". This is the weakest of the Microsoft books on project management.

  4. Software Runaways by Robert L. Glass, Prentice-Hall, 1998

    This is an interesting account of some disastrous projects with some analysis of what went wrong.

  5. Collection of Software Bugs by Prof. Thomas Huckle, Institut für Informatik, Munich

    Catastrophic software errors are a fact of life. Humans (or anything else that creates software) are imperfect and will never create perfect software (or perfect design requirements). However, careful design and documentation can reduce software errors. Prof. Huckle's web page cataloging serious software errors demonstrates the costs of software errors in critical systems.

Ian Kaplan, September 21, 2000
Revised: February 2004

