Section notes, week 7: Types

Proofs, inference Rules, and judgments

Our goal is to check whether a program has type errors. If it has no type errors, we want to prove it.

(Note 1. (Optional.) In general, a proof system is sound if all the statements we can derive with it really are true. That means the proof system never leads us astray by deriving false statements. The definition of soundness above is exactly this, and it also explains the meaning of the true statement.

Note that the meaning of the judgment has two parts: progress (evaluation proceeds all the way to a value without type errors), and preservation (the resulting value has the same type as the expression). Progress tells us that there are no type errors. Preservation tells us that our expression really does evaluate to type T, so we were correct to use T in further type checking.

Note also that in the definition of soundness we are saying that if we can derive using inference rules a judgment that expression e has type T, then e evaluates to a value v that is a member of the set T. The importance of soundness is that it links the two views of types, derivations and sets.)

(Note 2. (Also optional.) There is also a property called completeness. A type system is complete if we can derive any true statement with it. Equivalently, this means there is no true statement that we can't derive. A type system that is both sound and complete would be able to derive all true statements and only true statements.

Computability theory tells us that we can't have a sound and complete type system. Given that we can't have both, we go for soundness. (Why?) In an incomplete type system, there are programs that have no type errors but that don't type check. (Why?) It's one of the costs of type checking.)

Functional language type derivations


Imperative language (Decaf) type checking

It's not as convenient to use inference rules to express the type checking for Decaf, although we could do it. The rules are harder to write down because the structure of the code does not mirror the structure of the typing as clearly in an imperative language. For example, variables are typically declared using statements in a block, rather than in a let expression with the variable scope being the child of the let node.

In an imperative language like Decaf, we do it like this:

  1. Do name analysis with a symbol table. This finds name errors and determines the type of every name instance.
  2. Compute the type of every AST node with a bottom-up traversal. The bottom-up traversal uses "rules" that are similar to, but not quite the same as, the functional inference rules.

Question: What do we do when type checking fails for a node? For example, what if there is an "division" AST node with an integer left child and a string right child? Well, first we record an error message. We'd like to be able to keep checking so we can find more errors on this compiler run. We'd also like to avoid printing 500 more error messages that were just caused by this one.

We can assign a special type "error" to the AST node. Then, we design our bottom-up traversal so that any node with a child of type "error" just gets type "error" with no further messages printed.

There are other things that have to be dealt with in real languages. For example, what type do you think Java assigns to a+b in this code:


    byte a = 45;

    short b = -99;

    System.out.println(a+b);

The answer is int. In fact, Java converts, or "promotes", a and b to ints before doing the addition. This is because most architectures have instructions for adding 32-bit signed integers, but they don't necessarily have instructions for dealing with shorter values. Also, that's the way C does it. Note that the compiler actually adds nodes to the AST to represent the conversions during type checking.