Build It Faster In Java: the software reuse story

Sun Microsystems claims that "Java changes everything". Java is, according to Sun, a new paradigm and a revolutionary new way to create software. Software created in Java is more reliable and will run "everywhere".

In fact Java is evolutionary, not revolutionary. Nor is the Java paradigm of interpreted software new. However, the ability to rapidly develop applications in Java has been commented on by many people who do not work for Sun's marketing department. The productivity people experience developing applications in Java results not so much from the Java programming language, but from the extensive Java class library. Java allows programmers to develop software faster because it supports software reuse.

Software Reuse: the Shining Promise

As long as there have been programmable digital computer systems, software has been expensive to create. The complexity of software makes errors in design and implementation, commonly called "bugs", unavoidable. Software errors can have a high cost for the software users. Identifying and fixing software errors is also a large part of the "software life-cycle" cost for those who create the software.

The idea behind software reuse is to build new software from existing software components that have been carefully designed and debugged over time. In cases where software reuse is successful, programmers can create more reliable software faster, which is another way of saying that the software costs less.

Like any idea that promises to reduce software life-cycle costs, software reuse is beloved by managers and business consultants. Although software reuse is a powerful idea, large-scale software reuse has actually been achieved in only a few cases. The cost of software reuse (and software quality) is rarely discussed. Although reusable software can save a great deal of money, it has a real cost. Reusable software is difficult to design and implement, since in many cases it takes deep insight to produce a reusable software package. The software must also be clearly documented or it cannot be reused.

Software Reuse: A Historical View

The earliest examples of software reuse come from the scientific and engineering community and include the Fortran function libraries for the standard elementary functions (e.g., sin, cos and sqrt) and linear algebra packages like LINPACK and its successor, LAPACK.

Software engineers fell in love with the UNIX operating system in part because it provided a rich environment for software reuse. Developing software on UNIX was easier and faster than on proprietary operating systems from companies like IBM and Digital Equipment Corp. (which was later purchased by Compaq). UNIX has been widely embraced in the scientific and engineering community. UNIX systems have also been widely used in business applications like financial market trading systems.

UNIX provides a large set of standard utilities (e.g., grep, awk, shell scripting languages, lex, YACC, etc.). Using scripting languages like the command shell languages Sh or Csh, or the more powerful Perl or Python languages, these utilities can be combined with features provided by the scripting language to rapidly create new applications. UNIX also provides a standard set of system calls and library functions which can be referenced from C or C++. The UNIX system calls and library functions became the base for the POSIX library.

From the end of the Second World War to the fall of the Berlin Wall, the United States Department of Defense (DoD) was one of the largest (if not the largest) customers for and developers of software in the world. Many of the DoD funded software systems were in critical weapon systems, where the lives of US military personnel were at stake. Despite this, software quality was poor and cost overruns were frequent. The amount of money spent on software was increasing exponentially, with no improvement in system quality. In response to this ugly trend, the DoD funded the development of the Ada programming language. Ada was supposed to encourage the development of reliable software and support software reuse. Ada was (and may still be) mandated for all new DoD funded software, especially embedded software systems.

The DoD hoped that their wonderful new language would be widely embraced by industry. Unfortunately, like so many creations of the United States military industrial complex, the Ada programming language was large, slow and difficult to use. Most Ada users were forced to use Ada by their DoD development contracts. The rest of the world increasingly used C and UNIX. Ada also heavily influenced the design of VHDL, a VLSI hardware design and simulation language. Like Ada, VHDL was developed with DoD funding, via the Defense Advanced Research Projects Agency (DARPA).

Ada and the more elegant Modula-2 supported software reuse through modules. Modules are an extension of function libraries. A module defined a set of procedures and functions which were exported by the module. A module was defined by its interface, which could have a variety of separately compiled implementations. This allowed the software that used the module to reference only the interface. The module implementation could be changed behind the curtain of the interface without causing any change in the software that used the module.
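The separation of interface from implementation can be sketched in Java, whose interface construct plays a similar role. This is only an illustration of the idea, not Ada or Modula-2 code, and every name in it (IntStack, ArrayIntStack, LinkedIntStack, StackClient) is hypothetical:

```java
// The "interface" half of the module: the only thing client code sees.
interface IntStack {
    void push(int value);
    int pop();
    boolean isEmpty();
}

// One implementation: a growable array.
class ArrayIntStack implements IntStack {
    private int[] items = new int[4];
    private int size = 0;

    public void push(int value) {
        if (size == items.length) {
            // Double the backing array when it fills up.
            int[] bigger = new int[items.length * 2];
            System.arraycopy(items, 0, bigger, 0, size);
            items = bigger;
        }
        items[size++] = value;
    }

    public int pop() {
        if (size == 0) throw new IllegalStateException("stack is empty");
        return items[--size];
    }

    public boolean isEmpty() { return size == 0; }
}

// A second implementation: a singly linked list.
class LinkedIntStack implements IntStack {
    private static final class Node {
        final int value;
        final Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    private Node top = null;

    public void push(int value) { top = new Node(value, top); }

    public int pop() {
        if (top == null) throw new IllegalStateException("stack is empty");
        int value = top.value;
        top = top.next;
        return value;
    }

    public boolean isEmpty() { return top == null; }
}

// Client code is written against the interface alone.
class StackClient {
    static int drainAndSum(IntStack stack) {
        int sum = 0;
        while (!stack.isEmpty()) sum += stack.pop();
        return sum;
    }
}
```

StackClient.drainAndSum gives the same answer whether it is handed an ArrayIntStack or a LinkedIntStack, and it does not need to change when one is swapped for the other.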

Outside of Ada, which was largely forced on its user community, the module-based languages were not widely adopted. The exception is VHDL, which is used by about 40% of the VLSI design groups. Although the module-based languages were designed to encourage software reuse, there are no significant software reuse success stories. The reason for this may be that these languages have been eclipsed by C and C++; C++ was in the early design stages at about the time Modula-2 gained some popularity.

The C++ statically typed object model heavily influenced Java. One view of Java is that Java is C++ designed correctly. The object model is a more powerful tool than the module for software reuse. As a result, software reuse has been more successful in C++ as well. Commercial object packages like the Rogue Wave Software class library have gained some acceptance. The C++ Standard Template Library is also starting to gain widespread use. On the Microsoft Windows platforms the Microsoft Foundation Classes are widely used to create GUI software. Although software reuse is higher in C++ than in the module-based languages, there has been no software reuse success in C++ on the scale delivered by UNIX and C.

What Accounts for Java's Advantage in Software Development

Many people have commented that writing software in Java allows them to be much more productive than in C++. Why is this? Is software development in Java faster because "Java changes everything"? According to the ads Sun has been running in magazines like The New Yorker and Vanity Fair, Java provides an amazing new paradigm, used by the Uber Programmer, that drastically cuts the cost of software development. Is this true? To answer these questions we must examine Java, its runtime environment and its class library, and leave the hype behind.

The Java Programming Language

The Java programming language is defined in The Java Language Specification by Gosling, Joy and Steele. This book defines the Java language syntax and semantics. It also defines a class library. While support for some functions in the class library may have an effect on the way the language is implemented, from the point of view of a compiler developer, the class library is not formally part of the language.

The Java programming language is not radically new. Java has evolved from existing languages and has been heavily influenced by the C++ object model. The designers of Java learned from C++ and implemented a language that, while still large, is smaller and cleaner than C++. Java also supports memory management via garbage collection. There are over thirty years of work on garbage collection algorithms, so this is not new either. But Java is the first widely used language to include garbage collection and to take advantage of a consensus that has started to build about the problems of explicit memory management. Java does not yet include the ability to create generic objects, as can be done with templates in C++, so in some ways Java is also more limited than C++. On balance, Java is better than C++ in a number of ways. But the advantages of the Java language are not great enough to account for the increase in productivity that some programmers report.
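Without generic types, a container holds plain Object references and the caller casts each element on the way out. A minimal sketch of that idiom follows. The class and method names (NoGenericsSketch, sumLengths) are hypothetical; only java.util.Vector comes from the class library:

```java
import java.util.Vector;

class NoGenericsSketch {
    // The Vector's element type is Object, so nothing stops a caller from
    // adding a non-String; the mistake surfaces only at the cast below.
    static int sumLengths(Vector strings) {
        int total = 0;
        for (int i = 0; i < strings.size(); i++) {
            // The cast is checked at run time, not at compile time.
            String s = (String) strings.elementAt(i);
            total += s.length();
        }
        return total;
    }
}
```

If a caller adds an Integer to the Vector, sumLengths still compiles and fails with a ClassCastException only when it runs. A C++ template container of strings would reject the same mistake at compile time.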

The Java Virtual Machine

Java runs on a software machine implemented by the Java Virtual Machine (JVM) interpreter. Since any computer system, with any processor or operating system, may support the Java Virtual Machine, in theory a Java program compiled into Java byte code will run the same way on all platforms. This is the basis of Sun's "write once, run anywhere" claim for Java. The JVM provides some real advantages in a distributed environment. It also makes porting Java much easier. But it does not entirely remove incompatibility. The Java language and class libraries are very complex. No two groups will implement them the same way. There may be differences when a Java program is compiled with, say, IBM's VisualAge for Java and Sun's Java compiler (javac). Differences in implementation are a major problem in porting C++ applications to different computer systems, for example. Here again Java provides some advantage, but it does not account for the claims made for Java.

The Java Runtime and Class Library

Java has a huge standard class library. The functions supported by this class library range from simple String support to networking and graphical user interface creation. Java is a reasonably large programming language, but its complexity is easily equaled by the class library. Sun licenses the source for the class library, so any vendor that agrees to Sun's terms can support the class library on their Java platform. As a result, the Java class library is both large and ubiquitous, creating a standard software environment on all Java platforms. The tsunami of popularity that Java has enjoyed has resulted in a huge body of literature on all aspects of the Java class library. This reduces the learning curve and makes the class library easier to use.

The advantage yielded by Java in implementing some applications is not a result of the language or the virtual machine, but is due to the class library. By reusing functions from Java's huge class library, software can be implemented faster and with fewer errors, since the class library has been debugged through wide use. This makes Java one of the most significant instances of software reuse since UNIX and C.
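As a small illustration of this kind of reuse, the sketch below splits a line of text into words and sorts them without writing a tokenizer or a sort routine. The class and method names (LibraryReuseSketch, sortedWords) are hypothetical; java.util.StringTokenizer and java.util.Arrays are standard library classes:

```java
import java.util.Arrays;
import java.util.StringTokenizer;

class LibraryReuseSketch {
    // Returns the whitespace-separated words of text, lower-cased and sorted.
    static String[] sortedWords(String text) {
        StringTokenizer tokens = new StringTokenizer(text); // library: tokenizing
        String[] words = new String[tokens.countTokens()];
        for (int i = 0; i < words.length; i++) {
            words[i] = tokens.nextToken().toLowerCase();
        }
        Arrays.sort(words);                                 // library: sorting
        return words;
    }
}
```

Both the tokenizing and the sorting are borrowed rather than written, which is the point: the only application code left is the loop that glues the two library facilities together.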

Microsoft's Foundation Classes also provide a large library of reusable code on the Windows platforms. The Microsoft development environment is far more powerful than anything provided by Sun, so GUI development with Microsoft's Visual C++ can be as fast as or faster than similar development in Java. But the Microsoft Foundation Classes do not provide a standard cross-platform class library. Companies like Mainsoft have implemented versions of the Microsoft Foundation Classes on UNIX, but there is always a concern about compatibility, since cross-platform compatibility does not seem to be high on Microsoft's list of priorities. The abstractions provided by the Java class library are also cleaner, which makes them easier to understand and use.

Software development is not always faster in Java. Java yields a significant advantage only when the class library can be leveraged. The first large application I developed in Java was a class file disassembler, javad. I did not find developing this program in Java to be significantly faster than development would have been in C++. The javad program is similar to a parser and the Java class library provides no advantage for this kind of application.
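To give a feel for this kind of work, here is a minimal sketch of the first step any class file reader must take: checking the 0xCAFEBABE magic number and reading the version fields that follow it. This is not code from javad. The names (ClassHeaderSketch, readVersion) are hypothetical, and the sketch assumes the whole file has already been read into a byte array:

```java
import java.nio.ByteBuffer;

class ClassHeaderSketch {
    static final int CLASS_FILE_MAGIC = 0xCAFEBABE;

    // Returns {minorVersion, majorVersion} from the first 8 bytes of a class file.
    static int[] readVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("too short to be a class file");
        }
        // ByteBuffer reads big-endian by default, matching the class file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        if (buf.getInt() != CLASS_FILE_MAGIC) {
            throw new IllegalArgumentException("not a Java class file");
        }
        int minor = buf.getShort() & 0xFFFF; // u2 minor_version
        int major = buf.getShort() & 0xFFFF; // u2 major_version
        return new int[] { minor, major };
    }
}
```

The library supplies the byte-reading primitives, but the knowledge of the file layout, and everything after these first eight bytes, has to be written by hand, much as it would be in C++.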

An early version of this Web page appeared in a note I wrote in the Java Developer's Journal Java Forum. This note discusses Rich Hightower's article Programming Languages for the JVM (the links to my note and Hightower's article seem to have disappeared).

Ian Kaplan, February 13, 2000
