A Type-safe Enumeration Class in Java (and C++)

Java inherited much of its syntax from C++, which in turn inherited much of its syntax and semantics from C (in fact, early versions of C++ were designed to be a superset of C). Because the languages share so much syntax, programmers sometimes carry idioms that they are familiar with from C++ over to Java. In some cases Java offers better idioms. One of these involves enumerations.

Enumerations allow a set of named values to be created. Before enumerations, the only way to do this in C was with #define. For example:

#define APPLE 1
#define PEAR  2
#define PEACH 3
#define PLUM  4

The C preprocessor (cpp) processes all C and C++ files before the compiler proper runs. It substitutes the defined value for every occurrence of the defined name, so wherever PEAR appears in the code, the preprocessor fills in 2.

While #define constants are better than just using unnamed numeric values, this is a crude way to define a set of values. Later versions of C added enumerations, which are also in C++. An example of a C++ enumeration defining the same values is shown below:

typedef enum { APPLE = 1, PEAR, PEACH, PLUM } fruit;

In C++ enumerations also got a thin veneer of type safety. To convert an integer to an enumeration it is necessary to use a cast:

  fruit basket = (fruit)42;

The range for enumeration values is not enforced in C++, so the statement above compiles without error, even though 42 is beyond the enumeration range. In C++ enumeration values can also be assigned to integers without a cast operation:

  int y = PLUM;

As of Java 1.4, Java does not provide an enumeration type, so it is tempting to use something like the C #define. For example:

class StateMachine
{
    public static final int WAIT    = 1;
    public static final int NICKLE  = 2;
    public static final int DIME    = 3;
    public static final int QUARTER = 4;
    ....
    private int currentState = WAIT;
}

Type checking helps catch implementation errors. In the example above, however, the final values are simply integers, so there is nothing to stop us from assigning currentState the value 42.
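The problem can be seen in a minimal sketch. The class name IntStateMachine and the setState and getState methods are hypothetical, added only so that the state can be changed and inspected from outside the class:

```java
// Hypothetical cut-down state machine that uses plain int constants.
class IntStateMachine {
    public static final int WAIT    = 1;
    public static final int NICKLE  = 2;
    public static final int DIME    = 3;
    public static final int QUARTER = 4;

    private int currentState = WAIT;

    // Any int is accepted here; the compiler cannot tell a valid
    // state constant from an arbitrary number.
    void setState(int newState) { currentState = newState; }

    int getState() { return currentState; }
}

class IntStateDemo {
    public static void main(String[] args) {
        IntStateMachine machine = new IntStateMachine();
        machine.setState(IntStateMachine.DIME); // intended use
        machine.setState(42);                   // compiles, but 42 is not a state
        System.out.println("state = " + machine.getState()); // prints "state = 42"
    }
}
```

The only defense with int constants is a run-time range check inside setState, which moves the error from compile time to run time.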

In his excellent book Effective Java Programming Language Guide, Joshua Bloch shows how a type safe enumeration can be used in Java to define a set of named values (see Item 21: Replace enum constructs with classes). For example:

public class MachineStates
{
    private static int enumCount = 1;
    private int enumVal;
    private String name;

    private MachineStates( String str )
    {
      name = str;
      enumVal = enumCount;
      enumCount++;
    }
    
    public String toString() { return name; }
    public int toInt() { return enumVal; }

    public static final MachineStates WAIT = new MachineStates("WAIT");
    public static final MachineStates NICKLE = new MachineStates("NICKLE");
    public static final MachineStates DIME = new MachineStates("DIME");
    public static final MachineStates QUARTER = new MachineStates("QUARTER");
}

Now if a currentState variable of type MachineStates is declared, it can only take on the values {WAIT, NICKLE, DIME, QUARTER} (or null). An attempt to assign anything else, such as the integer 42, will be caught at compile time.
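A short usage sketch may make this concrete. The MachineStates class is repeated in condensed form so that the example is self-contained; the MachineStatesDemo class is hypothetical and exists only to exercise it:

```java
// Condensed copy of the MachineStates class shown above.
final class MachineStates {
    private static int enumCount = 1;
    private final int enumVal;
    private final String name;

    // The private constructor means no values can be created outside this class.
    private MachineStates(String str) {
        name = str;
        enumVal = enumCount++;
    }

    @Override public String toString() { return name; }
    public int toInt() { return enumVal; }

    public static final MachineStates WAIT = new MachineStates("WAIT");
    public static final MachineStates NICKLE = new MachineStates("NICKLE");
    public static final MachineStates DIME = new MachineStates("DIME");
    public static final MachineStates QUARTER = new MachineStates("QUARTER");
}

// Hypothetical demo class, not part of the original example.
class MachineStatesDemo {
    public static void main(String[] args) {
        MachineStates currentState = MachineStates.WAIT;
        currentState = MachineStates.DIME;

        // currentState = 42;  // would not compile: an int is not a MachineStates

        // Each constant is a unique object, so == is a valid comparison.
        if (currentState == MachineStates.DIME) {
            // prints "state = DIME (3)"
            System.out.println("state = " + currentState
                               + " (" + currentState.toInt() + ")");
        }
    }
}
```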

Type safe enumeration classes are increasingly used in Java. For example, the Apache Jakarta project Log4J logging package labels log messages using the Level type safe enumeration class, which defines the values Level.INFO, Level.DEBUG, Level.ERROR, Level.FATAL, and Level.WARN.

Depending on the application, additional features can be added to the type safe enumeration, so it would be nice to have a base class that packages the core functionality. The base class can then be used to define a variety of type safe enumeration subclasses:

class ToolEnum extends TypeSafeEnum
{
    private ToolEnum(String name)
    {
      super( name, ToolEnum.class );
    }

    // final, so the enumeration values cannot be reassigned
    public static final ToolEnum Hammer = new ToolEnum("Hammer");
    public static final ToolEnum Saw = new ToolEnum("Saw");
    public static final ToolEnum ScrewDriver = new ToolEnum("ScrewDriver");
}

and

class FruitEnum extends TypeSafeEnum
{
    private FruitEnum(String name)
    {
      super( name, FruitEnum.class );
    }
    // final, so the enumeration values cannot be reassigned
    public static final FruitEnum Apple = new FruitEnum("Apple");
    public static final FruitEnum Pear = new FruitEnum("Pear");
    public static final FruitEnum Orange = new FruitEnum("Orange");
}

The TypeSafeEnum base class can be found on this web page or downloaded here.
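For readers without the source at hand, here is a rough sketch of what such a base class could look like. It is not the actual TypeSafeEnum code; it only assumes the (name, class) constructor used by the subclasses above, and the per-subclass counter map, the nextValueFor helper, and the toInt method are guesses made for illustration. It also uses generics, which are newer than the Java version discussed here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a TypeSafeEnum base class (an assumption, not the real one).
abstract class TypeSafeEnum {
    // Next value to hand out, tracked separately for each subclass.
    private static final Map<Class<?>, Integer> nextValue = new HashMap<>();

    private final String name;
    private final int value;

    protected TypeSafeEnum(String name, Class<?> cls) {
        this.name = name;
        this.value = nextValueFor(cls);
    }

    // Returns the next unused value for the given subclass, starting at 1.
    private static synchronized int nextValueFor(Class<?> cls) {
        int v = nextValue.getOrDefault(cls, 1);
        nextValue.put(cls, v + 1);
        return v;
    }

    @Override public String toString() { return name; }
    public int toInt() { return value; }
}

class ToolEnum extends TypeSafeEnum {
    private ToolEnum(String name) { super(name, ToolEnum.class); }

    public static final ToolEnum Hammer = new ToolEnum("Hammer");
    public static final ToolEnum Saw = new ToolEnum("Saw");
}

class FruitEnum extends TypeSafeEnum {
    private FruitEnum(String name) { super(name, FruitEnum.class); }

    public static final FruitEnum Apple = new FruitEnum("Apple");
    public static final FruitEnum Pear = new FruitEnum("Pear");
}

class TypeSafeEnumDemo {
    public static void main(String[] args) {
        // Each subclass numbers its own values independently.
        System.out.println(ToolEnum.Saw + " = " + ToolEnum.Saw.toInt());
        System.out.println(FruitEnum.Apple + " = " + FruitEnum.Apple.toInt());
    }
}
```

Passing the subclass's Class object to the base constructor is what lets the base class keep the numbering of ToolEnum and FruitEnum apart. A ToolEnum variable still cannot be assigned a FruitEnum value, because the two subclasses are unrelated types.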

An Attempt at Type Safe Enumerations in C++

I am currently developing version three of a spam filter designed to run on a UNIX shell. Java would be a poor choice for this application, primarily because I need good performance so email processing uses as little processing time as possible. For many applications Java is 5 to 10 times slower than C++.

The structure of the C++ programming language is similar to that of Java, and some code translates easily from one language to the other. The type safe enumeration is an improvement over the enum type in C++, so I tried to implement it in C++. This section discusses this unsuccessful effort. This is a case where Java really shines and C++ is not as powerful by comparison.

As with the Java example, I've used a base class to define the core type safe enumeration class. The C++ version is not as powerful, however, and does not support multiple enumeration types defined on the base class. The class definition for the base class is shown below:


TypeSafeEnum.h

#ifndef TYPESAFEENUM_H
#define TYPESAFEENUM_H

#include <stddef.h>  // size_t

/**
   A "type safe" enum inspired by the "type safe" enum in Java.

   This enum is considerably more awkward than the similar class
   in Java, however.
 */
class TypeSafeEnum
{
 private:
  static size_t enum_count;
  const char *enumName;
  size_t val;

 public:
  TypeSafeEnum() : val(0), enumName(0) {}
  TypeSafeEnum( const char *name ) : val(enum_count)
  {
    enumName = name;
    enum_count++;
  }

  const static TypeSafeEnum null;
  TypeSafeEnum(const TypeSafeEnum &rhs) : val(rhs.val)
  {
    enumName = rhs.enumName;
  }

  size_t getVal() const { return val; }
  const char *getName() const { return enumName; }
  static size_t getMaxEnum() { return enum_count; }


  /**
     This uses a pointer compare
  */
  bool operator ==(const TypeSafeEnum &rhs ) const
  {
    return (enumName == rhs.enumName);
  }

  bool operator !=(const TypeSafeEnum &rhs ) const
  {
    return (enumName != rhs.enumName);
  }
}; // TypeSafeEnum

#endif

Unfortunately there is no way to define static initializers in the class itself in C++ (perhaps a later version will fix this problem). So the count variable (enum_count) must be defined in a .C file (or .cpp):


TypeSafeEnum.C

#include "TypeSafeEnum.h"

size_t TypeSafeEnum::enum_count = 0;

const TypeSafeEnum TypeSafeEnum::null("null");

The SpamEnum class is derived from the TypeSafeEnum class. When I was developing this class I was thinking about defining an enumeration value to represent the various parts of an email (e.g., to, from, subject, etc.). The SpamEnum class is shown below:


SpamEnum.h

#ifndef SPAMENUM_H
#define SPAMENUM_H

#include "TypeSafeEnum.h"

/**
   A "type safe" enum inspired by the "type safe" enum in Java.

 */
class SpamEnum : public TypeSafeEnum
{
 private:
  class TableElem {
  public:
    const char *name;
    TypeSafeEnum enumObj;
    TableElem(const char *n, const TypeSafeEnum e) 
    {
      name = n;
      enumObj = e;
    }
  };

  static const TableElem enumTable[];

 public:
  // SpamEnum()  {}
  SpamEnum(const char *name) : TypeSafeEnum( name ) {}

  static TypeSafeEnum findEnum( const char *name );

  static const SpamEnum toAddresses;
  static const SpamEnum fromAddresses;
  static const SpamEnum killWords;
  static const SpamEnum spamWords;
  
}; // SpamEnum

#endif

Again there is the problem that the class definition cannot have static initializers, as Java classes can. So a .C (or .cpp) file must define the static initializers for the enumeration objects (e.g., toAddresses).


SpamEnum.C

#include <string.h>  // strcmp

#include "SpamEnum.h"

const SpamEnum SpamEnum::toAddresses("to_addresses");
const SpamEnum SpamEnum::fromAddresses("from_addresses");
const SpamEnum SpamEnum::killWords("kill_words");
const SpamEnum SpamEnum::spamWords("spam_words");

// Static members are named with the scope resolution operator (::), not a dot.
const SpamEnum::TableElem SpamEnum::enumTable[] = { 
  SpamEnum::TableElem(SpamEnum::toAddresses.getName(), SpamEnum::toAddresses),
  SpamEnum::TableElem(SpamEnum::fromAddresses.getName(), SpamEnum::fromAddresses),
  SpamEnum::TableElem(SpamEnum::killWords.getName(), SpamEnum::killWords),
  SpamEnum::TableElem(SpamEnum::spamWords.getName(), SpamEnum::spamWords),
  SpamEnum::TableElem(0, TypeSafeEnum::null) };

TypeSafeEnum SpamEnum::findEnum( const char *name )
{
  TypeSafeEnum enumObj = TypeSafeEnum::null;
  size_t ix = 0;
  while (enumTable[ix].name != 0) {
    if (strcmp(name, enumTable[ix].name) == 0) {
      enumObj = enumTable[ix].enumObj;
      break;
    }
    else {
      ix++;
    }
  }
  return enumObj;
}

Compared to Java this code is awkward. The C++ code is larger, and the nature of C++ forces it to be spread across four files. The C++ enumeration is not as easy to use as the Java version either. Static class variables are referenced via the type name and the scope resolution operator (i.e., SpamEnum::toAddresses) rather than with the dot used in Java. Static methods are referenced the same way (e.g., SpamEnum::getMaxEnum()); they can also be called through a class instance, but no instance is required. A small test program is shown below:


#include <stdio.h>

#include "SpamEnum.h"

int main()
{
  // Static members are accessed with ::, not with a dot as in Java.
  const SpamEnum x = SpamEnum::toAddresses;

  if (x == SpamEnum::toAddresses)
    printf("It worked\n");

  if (x != SpamEnum::fromAddresses)
    printf("It still works\n");

  // getVal() and getMaxEnum() return size_t, so print them with %zu.
  printf("maxEnum = %zu\n", SpamEnum::getMaxEnum() );

  printf("SpamEnum.%s value is %zu\n", 
         SpamEnum::toAddresses.getName(), 
         SpamEnum::toAddresses.getVal() );
  printf("SpamEnum.%s value is %zu\n", 
         SpamEnum::fromAddresses.getName(), 
         SpamEnum::fromAddresses.getVal() );

  printf("SpamEnum.%s value is %zu\n", 
         SpamEnum::killWords.getName(), 
         SpamEnum::killWords.getVal() );

  printf("SpamEnum.%s value is %zu\n", 
         SpamEnum::spamWords.getName(), 
         SpamEnum::spamWords.getVal() );

  const char *words[] = { "to_addresses", "from_addresses", "kill_words", "spam_words", 0};

  const char **ptr = words;
  while (*ptr != 0) {
    TypeSafeEnum enumObj = SpamEnum::findEnum( *ptr );
    printf("%s = %zu\n", *ptr, enumObj.getVal());
    ptr++;
  }
  return 0;
}

I regard my attempt at a type safe enum in C++ as a failure. The complexity of my C++ implementation of the type safe enumeration is unwieldy. Although I lose the advantages of strong typing for enumerations, it is far simpler to use the standard enum construct in C++.

Ian Kaplan, October, 2003