A Partial Java Front End To do List


For a language as large as Java, a grammar of a manageable size will not flag all illegal Java constructs with an error. This is a partial list of checks that must be performed after parsing Java. The Java Language Specification (JLS) contains a detailed discussion of Java semantics. This list concentrates largely on syntactic issues, although there is some overlap. The material published here is simply a snapshot, since the list is maintained offline. I have published it here as an example for the page on the Java front end.

  1. Initialization assignment

    The class below shows two examples of simple initialization assignment.

    class foobar {
      int x = 42;
      int a[] = {1, 2, 3, 4};
    }
    

    Initialization may include complex expressions:

    class foobar {
      int x = 42;
      int y = x * 2;  // allowed because x precedes the expression
    }
    

    or

    
    import java.lang.Math;
    
    class zorch {
      double r = Math.random();
    }
    
    

    Semantic checking must be done on the initialization expression, because any legal expression is syntactically correct. For example

    class foobar {
      int y = y + 2;   // compile time error
    }
    

    is syntactically correct but semantically illegal, since y is a forward reference to itself and has no initial value. The op-equal form is also semantically illegal but syntactically allowed.

    class foobar {
      int y += 2;   // compile time error
    }
    

    Similarly,

  2. class foobar {
      int z = (z = 0) + 2;   // compile time error
    }
    

    is illegal because it contains an invalid forward reference to z. JLS 8.3.2.1 states:

    A compile-time error occurs if an initialization expression for a class variable contains a use by a simple name of that class variable or of another class variable whose declaration occurs to its right (that is, textually later) in the same class.

    The class below should also be flagged with a compile-time error.

    
    class foobar {
      int j = i * 2;  // invalid forward reference to member "i"
      int i = 42;
    }
    
    

    But the class below, with a static declaration for "i", is allowed:

    
    class foobar {
      int j = i * 2;      // forward reference allowed because "i" is static
      static int i = 42;
    }
    
    

    So the To Do is to add semantic checking for variable initialization. This will be done on the ASTs.
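
    The forward reference rule quoted above can be sketched as a pass over the field declarations in textual order. This is only a minimal sketch under stated assumptions: FieldDecl and ForwardRefCheck are hypothetical names, not part of the front end, and the set of names used by each initializer is assumed to have been collected from the AST already. It treats every use alike and ignores shadowing by local names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified view of one field declaration: its name, whether it
// is static, and the simple names that appear in its initializer expression.
class FieldDecl {
    final String name;
    final boolean isStatic;
    final List<String> usedNames;

    FieldDecl(String name, boolean isStatic, String... usedNames) {
        this.name = name;
        this.isStatic = isStatic;
        this.usedNames = Arrays.asList(usedNames);
    }
}

class ForwardRefCheck {
    // Fields must be passed in textual order. A use is reported when an
    // initializer names its own field, or a field declared later, and both
    // fields are of the same kind (both static or both instance).
    static List<String> check(List<FieldDecl> fields) {
        List<String> errors = new ArrayList<String>();
        for (int i = 0; i < fields.size(); i++) {
            FieldDecl current = fields.get(i);
            for (int j = i; j < fields.size(); j++) {
                FieldDecl other = fields.get(j);
                if (other.isStatic == current.isStatic
                        && current.usedNames.contains(other.name)) {
                    errors.add("invalid forward reference to \"" + other.name
                            + "\" in the initializer of \"" + current.name + "\"");
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Models: int j = i * 2;  int i = 42;
        List<FieldDecl> fields = Arrays.asList(
                new FieldDecl("j", false, "i"),
                new FieldDecl("i", false));
        for (String error : check(fields)) {
            System.out.println(error);
        }
    }
}
```

    Comparing the static flags is what lets the instance field "j" refer to the later static field "i" without an error, as in the last example above.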

  3. Array initializers

    The size of an array in a new expression may not be specified if there is an array initializer. For example:

           int b[] = new int[5] { 5, 6, 7, 8, 9}; -- illegal
    

    This must be checked in semantic analysis, since it is difficult to check in the parser.

    An array may also illegally contain a non-array initializer, as below. This must be caught in semantic analysis as well.

           int b[][] = new int[][] { 4, { 5, 6, 7, 8, 9}};
    

    The semantic type rule is that all children in an array initializer must be of the same type (in this case int[]). Here one is an integer literal and one is an int[].

    Obviously the initializer must also match the new type (int[][] in the above example).
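
    Both array checks can be sketched over a small initializer tree. This is a hedged illustration only: Init and ArrayInitCheck are hypothetical names, and the sketch assumes every non-braced element is a scalar expression. A real check would compare element types, since an element such as a variable of type int[] is also legal inside an int[][] initializer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical initializer node: either a plain expression such as 4
// (elements == null) or a braced list such as { 5, 6, 7 }.
class Init {
    final List<Init> elements;

    private Init(List<Init> elements) {
        this.elements = elements;
    }

    static Init expr() {
        return new Init(null);
    }

    static Init braces(Init... elements) {
        return new Init(Arrays.asList(elements));
    }

    boolean isBraced() {
        return elements != null;
    }
}

class ArrayInitCheck {
    // new int[5] { ... } is illegal: dimension sizes and an initializer
    // may not appear together.
    static boolean sizeAndInitializerOk(int sizedDimensions, boolean hasInitializer) {
        return !(hasInitializer && sizedDimensions > 0);
    }

    // Checks that a braced initializer has the nesting expected for an array
    // with the given number of dimensions (2 for int[][]).
    static boolean shapeOk(Init init, int dimensions) {
        if (!init.isBraced()) {
            return false;
        }
        for (Init element : init.elements) {
            if (dimensions == 1) {
                if (element.isBraced()) {
                    return false;   // e.g. int a[] = { { 1 } }
                }
            } else if (!shapeOk(element, dimensions - 1)) {
                return false;       // e.g. the 4 in { 4, { 5, 6 } }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Models: new int[][] { 4, { 5, 6 } }
        Init mixed = Init.braces(Init.expr(), Init.braces(Init.expr(), Init.expr()));
        System.out.println("shape ok: " + shapeOk(mixed, 2));
        System.out.println("size ok: " + sizeAndInitializerOk(1, true));
    }
}
```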

  4. More initialization (static initializers and instance initializers)

    The variables initialized by a static initializer must be static. The variables initialized by an instance initializer may be either static (in which case the static variables get a new value) or instance variables.

  5. Out of range floating point values

    Java requires that compile time checking be done on floating point literal strings to make sure that their value would be in range. This is done for 32-bit float values but is not done for double. Fix the parser so that it catches this error.
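
    One possible way to make the range check is to let the library conversion routines do the rounding and then look at the result. The sketch below is an assumption about how this could be done, not the front end's actual code: FloatLiteralCheck is a hypothetical name, the input is assumed to be a non-empty, lexically valid decimal literal, and hexadecimal literals are not handled.

```java
class FloatLiteralCheck {
    // Returns an error message, or null when the literal is in range.
    // A literal is out of range when it rounds to infinity, or when it is
    // nonzero but rounds to zero.
    static String check(String literal) {
        String text = literal;
        char last = text.charAt(text.length() - 1);
        boolean isFloat = last == 'f' || last == 'F';
        if (isFloat || last == 'd' || last == 'D') {
            text = text.substring(0, text.length() - 1);   // drop the type suffix
        }
        double value = isFloat ? Float.parseFloat(text) : Double.parseDouble(text);
        if (Double.isInfinite(value)) {
            return "floating point literal too large: " + literal;
        }
        if (value == 0.0 && hasNonZeroDigit(text)) {
            return "floating point literal too small: " + literal;
        }
        return null;
    }

    // True if the digits before any exponent contain a digit other than 0.
    static boolean hasNonZeroDigit(String text) {
        for (char c : text.toCharArray()) {
            if (c == 'e' || c == 'E') {
                break;
            }
            if (c >= '1' && c <= '9') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = { "3.14f", "1e40f", "1e40", "1e400", "1e-400" };
        for (String sample : samples) {
            String error = check(sample);
            System.out.println(sample + ": " + (error == null ? "in range" : error));
        }
    }
}
```

    The same routine covers both float and double literals, so the double case does not need a separate code path.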

  6. Validation of the LHS of an assignment.

    Both the ANTLR grammar and the Java Language Spec grammars allow illegal items on the left hand side. For example, the ANTLR Java syntax allows a statement like

             ((val) ? x : y) += z
    

    which is syntactically incorrect in Java (or pretty much any other language). Similar problems exist in the LALR(1) grammar published in the JLS. For example:

       JLS 19.12
    
         Assignment:
            LeftHandSide AssignmentOperator AssignmentExpression
    
         LeftHandSide:
            Name
            FieldAccess
            ArrayAccess
    
         ArrayAccess:
            Name [ Expression ]
            PrimaryNoNewArray [ Expression ]
    
         PrimaryNoNewArray:
            Literal
            this
            ( Expression )
            ClassInstanceCreationExpression
            FieldAccess
            MethodInvocation
            ArrayAccess
    
         ClassInstanceCreationExpression:
            new ClassType '(' ( ArgumentList )? ')'
    

    The JLS grammar allows a statement like

            64[ i ] = 42
    

    which is clearly wrong. The JLS grammar also allows assignments like

          foobar zorch = new foobar();
          zorch.x = 1;
          zorch.y = 2;
          zorch.z = 3;
    
          new foobar() = zorch;  -- illegal assignment
    

    which are also incorrect.

    The semantic pass must validate the assignment LHS to ensure that only valid syntactic items appear on the LHS.
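
    A minimal sketch of such an LHS check is shown below. LhsNode, its Kind values and LhsCheck are hypothetical stand-ins for the real AST node types, which are not described here. The sketch only rejects node kinds that can never denote a variable; type checking of the array operand is assumed to happen elsewhere.

```java
// Hypothetical AST node: just enough structure to classify an assignment LHS.
class LhsNode {
    enum Kind { NAME, FIELD_ACCESS, ARRAY_ACCESS, LITERAL, CONDITIONAL, NEW_OBJECT }

    final Kind kind;
    final LhsNode operand;   // the array or object being accessed, or null

    LhsNode(Kind kind, LhsNode operand) {
        this.kind = kind;
        this.operand = operand;
    }
}

class LhsCheck {
    // Returns true when the node may appear on the left of an assignment.
    static boolean isValidLhs(LhsNode lhs) {
        switch (lhs.kind) {
            case NAME:
            case FIELD_ACCESS:
                return true;
            case ARRAY_ACCESS:
                // 64[ i ] parses under the JLS grammar, but a literal is never an array.
                return lhs.operand != null && lhs.operand.kind != LhsNode.Kind.LITERAL;
            default:
                // Conditional expressions, object creation and so on.
                return false;
        }
    }

    public static void main(String[] args) {
        LhsNode literal = new LhsNode(LhsNode.Kind.LITERAL, null);
        LhsNode badArray = new LhsNode(LhsNode.Kind.ARRAY_ACCESS, literal);
        System.out.println("64[ i ] valid: " + isValidLhs(badArray));
    }
}
```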

  7. Check instanceof operands.

    The instanceof operator may only be applied to object operands: the LHS must be an expression of reference type (e.g., an object variable) and the RHS must name a reference type. Otherwise a compile-time error must be issued.
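
    The operand test can be sketched as a check on type names. This assumes the semantic pass has already computed a type name for each operand. InstanceofCheck is a hypothetical name, and the sketch does not check whether a cast between the two types is possible.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class InstanceofCheck {
    private static final Set<String> PRIMITIVES = new HashSet<String>(Arrays.asList(
            "boolean", "byte", "short", "char", "int", "long", "float", "double"));

    // Anything that is not a primitive type (or void) is treated as a
    // reference type; array types such as int[] are reference types.
    static boolean isReferenceType(String typeName) {
        return !PRIMITIVES.contains(typeName) && !"void".equals(typeName);
    }

    // Both the expression type (LHS) and the named type (RHS) must be
    // reference types.
    static boolean operandsOk(String lhsType, String rhsType) {
        return isReferenceType(lhsType) && isReferenceType(rhsType);
    }

    public static void main(String[] args) {
        System.out.println("Object instanceof String ok: " + operandsOk("Object", "String"));
        System.out.println("int instanceof String ok: " + operandsOk("int", "String"));
    }
}
```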

  8. Check the semantics of the .class operator

    The .class operator must be the last word in a "dot" string (e.g., x.class, not class.y). Further, the .class operator can only be applied to a type.

  9. Check semantics of the array object declaration in "new"

    If the array has an initializer, it should not have dimension size definition expressions. For example, this is legal

        int a[] = new int[] { 1, 2, 3, 4, 5, 6 }; // legal
    

    But this is illegal

        int a[] = new int[6] { 1, 2, 3, 4, 5, 6 }; // illegal
    

  10. Constructor semantics

    A constructor is a method that has no type specification. Check that methods of this type have the same name as the enclosing class.
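
    A sketch of the name check, assuming the AST records a method's return type as null when no type specification was present. ConstructorCheck and its parameters are hypothetical names used only for illustration.

```java
class ConstructorCheck {
    // returnType is null when the declaration has no type specification.
    // Returns an error message, or null when the declaration is acceptable.
    static String check(String className, String methodName, String returnType) {
        if (returnType == null && !methodName.equals(className)) {
            return "\"" + methodName + "\" has no return type and does not match class \""
                    + className + "\"";
        }
        return null;
    }

    public static void main(String[] args) {
        // Models: class foobar { zorch() { } }
        System.out.println(check("foobar", "zorch", null));
    }
}
```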

  11. Constant Expression in switch case

    The expression associated with a case is a ConstantExpression. However, when Java is parsed it can't be determined whether the expression is constant or not. Before semantic analysis, fold constant expressions. If the case expression is not constant after folding report an error. The rules for ConstantExpression are:

    JLS 15.27 Constant Expression

    A compile-time constant expression is an expression denoting a value of primitive type or a String that is composed using only the following:

  12. For loop condition must be boolean

    As far as the parser is concerned, the for loop condition is simply an expression. However, in Java this expression must evaluate to a boolean value. This must be checked in the semantic phase.
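
    The check can be sketched with a very small type computation over the condition expression. This is an illustration under stated assumptions: Expr and ForConditionCheck are hypothetical, only int and boolean leaves and a handful of operators are modelled, and a missing condition (as in for (;;)) is represented by null.

```java
// Hypothetical expression node: a typed leaf, or a binary operator.
class Expr {
    final String op;         // "<", ">", "+", "*", "&&", "||", or null for a leaf
    final String leafType;   // "int" or "boolean" for a leaf
    final Expr left;
    final Expr right;

    private Expr(String op, String leafType, Expr left, Expr right) {
        this.op = op;
        this.leafType = leafType;
        this.left = left;
        this.right = right;
    }

    static Expr leaf(String type) {
        return new Expr(null, type, null, null);
    }

    static Expr binary(String op, Expr left, Expr right) {
        return new Expr(op, null, left, right);
    }
}

class ForConditionCheck {
    // Returns "int", "boolean", or "error" for an ill-typed expression.
    static String typeOf(Expr e) {
        if (e.op == null) {
            return e.leafType;
        }
        String l = typeOf(e.left);
        String r = typeOf(e.right);
        boolean bothInt = l.equals("int") && r.equals("int");
        boolean bothBoolean = l.equals("boolean") && r.equals("boolean");
        if (e.op.equals("<") || e.op.equals(">")) {
            return bothInt ? "boolean" : "error";
        }
        if (e.op.equals("&&") || e.op.equals("||")) {
            return bothBoolean ? "boolean" : "error";
        }
        if (e.op.equals("+") || e.op.equals("*")) {
            return bothInt ? "int" : "error";
        }
        return "error";
    }

    // A for loop may omit its condition; otherwise it must be boolean.
    static boolean conditionOk(Expr condition) {
        return condition == null || typeOf(condition).equals("boolean");
    }

    public static void main(String[] args) {
        Expr lessThan = Expr.binary("<", Expr.leaf("int"), Expr.leaf("int"));   // i < 10
        Expr sum = Expr.binary("+", Expr.leaf("int"), Expr.leaf("int"));        // i + 1
        System.out.println("i < 10 ok: " + conditionOk(lessThan));
        System.out.println("i + 1 ok: " + conditionOk(sum));
    }
}
```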