Compiler Front End and Infrastructure Software

Compiler infrastructure software usually includes one or more language front ends and software for abstract syntax tree (AST) generation and modification. Optimization and code generation phases may also be included. These software suites are specific to compiler creation and in this domain are more powerful than parser generators, as long as the user is willing to live with the pre-built infrastructure. They are also less general than parser generators, which can be used for a range of language processing tasks (e.g., HTML or XML parsing).

Although compiler design and development seems to have fallen out of favor as a research topic, it has a long and rich history. A great deal of work has been done on compiler infrastructure software. It is a gross understatement to say that this is not a complete bibliography. The OSUIF paper, referenced below, includes a more complete list of infrastructure tools.

Obviously these tool suites are not limited to or specific to Java. However, when considering the architecture for a Java compiler it is reasonable to consider whether some of this software can be used to simplify the huge development task.

See Why Use ANTLR for a related Web page that discusses the ANTLR parser generator and lists other parser generator options (including tools like state machine generators and tree walker generators).

There is no free lunch

Software reuse has been one of the mantras of software development for the last twenty years (remember Ada?). The concept is both appealing and impossible to argue with: by reusing tested, carefully designed software, more reliable software can be created in less time. In practice software reuse has proven difficult and there are few widespread successes. In my opinion there are, in fact, two: UNIX (or POSIX) and Java. The C++ Standard Template Library (STL) may be a more modest third.

There are two barriers to software reuse: design and complexity. Creating reusable software requires a deep understanding of object creation and reuse. This is a rare skill and one that software engineers spend their careers developing. Many object packages that are created for reuse are difficult to use because of design flaws. In addition to design issues, application-specific packages, like compiler infrastructures (I know you were wondering where I was going with this), are difficult to use because they have a steep learning curve. These infrastructures consist of tens of thousands of lines of code. They can be rapidly reused by their creators, but reuse is more difficult in the hands of someone else.

A compiler infrastructure could be created entirely out of objects which could be plugged in and out to create new compiler software. Examples of these objects include memory allocation, parsers, flow graph creation and optimization. However, it is difficult to create an efficient set of compiler objects that are truly interchangeable. For example, if the AST changes, will flow graph creation still work? As a result, many infrastructures are interdependent and it is difficult to reuse one piece without them all.
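The AST and flow graph coupling can be illustrated with a minimal Java sketch. Every name here (AstNode, Assign, IfStmt, WhileStmt, FlowGraphBuilder) is hypothetical and invented for illustration; none of it comes from a real infrastructure. The flow graph builder only produces a rough basic block count, and it was written knowing about two statement kinds. When a third kind is added to the AST, the builder has to be changed as well.

```java
import java.util.List;

// Hypothetical AST node types, invented for this sketch.
interface AstNode { }

final class Assign implements AstNode {
    final String target;
    Assign(String target) { this.target = target; }
}

final class IfStmt implements AstNode {
    final List<AstNode> thenBody;
    IfStmt(List<AstNode> thenBody) { this.thenBody = thenBody; }
}

// A statement kind added to the AST after the flow graph builder was written.
final class WhileStmt implements AstNode {
    final List<AstNode> body;
    WhileStmt(List<AstNode> body) { this.body = body; }
}

// Hypothetical flow graph "builder" that only counts basic blocks (roughly).
final class FlowGraphBuilder {
    static int countBlocks(List<AstNode> stmts) {
        int blocks = 1; // the block that holds the straight-line code
        for (AstNode node : stmts) {
            if (node instanceof Assign) {
                // Straight-line code stays in the current block.
            } else if (node instanceof IfStmt) {
                // The "then" body gets its own blocks, plus one join block.
                blocks += countBlocks(((IfStmt) node).thenBody) + 1;
            } else {
                // The builder depends on knowing every AST node kind.
                throw new UnsupportedOperationException(
                    "flow graph builder does not know " + node.getClass().getSimpleName());
            }
        }
        return blocks;
    }
}

final class FlowGraphDemo {
    public static void main(String[] args) {
        List<AstNode> program = List.of(
            new Assign("a"),
            new IfStmt(List.of(new Assign("b"))),
            new Assign("c"));
        System.out.println(FlowGraphBuilder.countBlocks(program));
        try {
            FlowGraphBuilder.countBlocks(List.of(new WhileStmt(List.of())));
        } catch (UnsupportedOperationException e) {
            System.out.println("AST changed: " + e.getMessage());
        }
    }
}
```

A visitor interface would turn the runtime failure into a compile-time error, but the dependency remains: the flow graph code still has to be touched whenever the AST grows.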

Finally, many of these infrastructures are aimed at research applications and trade performance and size for flexibility. This makes them impractical for production quality compilers.

Compiler Infrastructures

Commercial Compiler Front End Products

Compiler front ends, consisting of at least a parser and in many cases symbol table and semantic analysis support, are a lot of work to implement (the EDG front end, below, consists of almost 300K lines of source code and comments).

Over time a number of companies have offered front end software. One of the most widely used is the Edison Design Group's C++ front end. This front end reads C++ and outputs ANSI C.

The Edison Design Group (EDG) has been in business for quite a while. However, many front end companies have come and gone. For example, Compass Design Automation sold a VHDL front end that has been used by several EDA companies. Compass was purchased by Avant!, which as far as I know has stopped selling the front end. Another VHDL front end was available from Leda (a French software company). However, Leda has been purchased by Synopsys. Synopsys no longer seems to be selling the Leda VHDL front end (this is not a surprise, since Synopsys would be supplying potential competitors with an important component that could be used to build tools that compete with those sold by Synopsys).

Listed below is a partial list of front end companies (other than EDG, mentioned above):

Compiler creation software

The ANTLR parser generator has been discussed in detail elsewhere on this site. Parser generators are relatively general tools, since they can be used to build parsers for languages like HTML and scripting languages. Compiler creation software is specialized software for translating a language into native code. In addition to the parser generator, compiler creation software will usually include a standard intermediate representation that is shared by the language front ends, various optimization passes and code generators. At one time MetaWare sold compiler creation software. They don't seem to do this currently. Compass, a Boston area compiler and consulting company that worked on compilers for Thinking Machines and MasPar, used to sell compiler creation software, but they are no longer in business. So the list included below is rather short.
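The architecture described above, in which front ends, optimization passes and code generators all share one intermediate form, can be sketched roughly in Java as follows. This is only an illustration under my own assumptions: the Instr, FrontEnd, Pass, CodeGenerator and Pipeline names are hypothetical and do not describe the design of any product mentioned here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared intermediate form: a flat list of simple instructions.
final class Instr {
    final String op;    // for this sketch: "const" or "nop"
    final String dest;
    final int value;
    Instr(String op, String dest, int value) {
        this.op = op;
        this.dest = dest;
        this.value = value;
    }
}

// Each language front end translates source text into the shared form.
interface FrontEnd { List<Instr> translate(String source); }

// Each optimization pass rewrites the shared form.
interface Pass { List<Instr> run(List<Instr> ir); }

// Each code generator consumes the shared form.
interface CodeGenerator { String emit(List<Instr> ir); }

final class Pipeline {
    static String compile(String source, FrontEnd frontEnd,
                          List<Pass> passes, CodeGenerator codeGen) {
        List<Instr> ir = frontEnd.translate(source);
        for (Pass pass : passes) {
            ir = pass.run(ir);
        }
        return codeGen.emit(ir);
    }
}

final class PipelineDemo {
    public static void main(String[] args) {
        // Toy front end: accepts only "name = integer".
        FrontEnd toyFrontEnd = src -> {
            String[] parts = src.split("=");
            List<Instr> ir = new ArrayList<>();
            ir.add(new Instr("nop", "", 0)); // deliberately useless instruction
            ir.add(new Instr("const", parts[0].trim(), Integer.parseInt(parts[1].trim())));
            return ir;
        };
        // Toy optimization pass: drop the nops.
        Pass dropNops = ir -> {
            List<Instr> out = new ArrayList<>();
            for (Instr i : ir) {
                if (!i.op.equals("nop")) out.add(i);
            }
            return out;
        };
        // Toy code generator: one line of made-up assembly per instruction.
        CodeGenerator toyCodeGen = ir -> {
            StringBuilder sb = new StringBuilder();
            for (Instr i : ir) {
                sb.append("mov ").append(i.dest).append(", ").append(i.value).append('\n');
            }
            return sb.toString();
        };
        System.out.print(Pipeline.compile("x = 5", toyFrontEnd, List.of(dropNops), toyCodeGen));
    }
}
```

Because the three interfaces only meet at the list of Instr objects, a new front end or code generator can be added without touching the others. The cost is that everything depends on the shape of the intermediate form.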

Ian Kaplan, May 16, 2000
Revised: May 2007
