Our specifications and implementations of value-of specify and implement the dynamic semantics of expressions.
Expressions can also have static semantics, which concern the properties of expressions that can be deduced without executing the expressions.
Type safety is an important property of expressions. Whether type safety is static or dynamic depends upon the programming language. If type safety is a static property, then we say the language is strongly typed.
Some programming languages allow the type of an expression to be calculated without executing the expression. We say these languages are statically typed.
A language can be statically typed without being strongly typed. C and C++, for example, are statically typed but not strongly typed, because type safety is not a static property of those languages. The problem is that the C/C++ type system is unsound. Although the type of a C/C++ expression can be calculated statically, that type is not always a reliable prediction of the expression's value at run time.
For example:
    #include <stdio.h>
    #include <stdlib.h>

    double f (double * p) {
      *p = 3.14159;
      return *p;
    }

    int main (int argc, char* argv[]) {
      int n = 12345;
      double * p = (void *) &n;
      double x = f(p);
      double y = f(p+1);
      double z = f(p+2);
      printf ("n = %d\n", n);
      printf ("x = %lf\n", x);
    }
What output is printed by that program?
On one machine, the output was:
    n = 1074340345
    x = 3.141590
On another machine, the program terminated with a segmentation fault. The problem is that the program is not type-safe.
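For the LET, PROC, and LETREC languages, we expect the following conditions to hold whenever an expression is evaluated:

In (diff-exp exp1 exp2), the values of exp1 and exp2 are both numbers.

In (zero?-exp exp1), the value of exp1 is a number.

In (if-exp exp1 exp2 exp3), the value of exp1 is a boolean.

In (call-exp exp1 exp2), the value of exp1 is a procedure.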
If one of those conditions is violated, we call it a type error. (Hence not all errors are type errors.) We say that a LET, PROC, or LETREC program is type-safe if and only if its execution cannot possibly involve a type error.
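The four conditions can be made concrete with a small interpreter sketch. This is not the course's value-of; it is a minimal Python illustration that assumes expressions are encoded as tuples such as ("diff", e1, e2), and the names DynamicTypeError, Closure, and is_number are hypothetical helpers introduced here.

```python
from dataclasses import dataclass


class DynamicTypeError(Exception):
    """Raised when one of the four run-time conditions is violated."""


@dataclass
class Closure:
    var: str
    body: tuple
    env: dict


def is_number(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)


def value_of(exp, env):
    tag = exp[0]
    if tag == "const":
        return exp[1]
    if tag == "var":
        # An unbound variable raises KeyError: an error, but not a type error.
        return env[exp[1]]
    if tag == "diff":
        v1, v2 = value_of(exp[1], env), value_of(exp[2], env)
        if not (is_number(v1) and is_number(v2)):
            raise DynamicTypeError("diff-exp: both operands must be numbers")
        return v1 - v2
    if tag == "zero?":
        v1 = value_of(exp[1], env)
        if not is_number(v1):
            raise DynamicTypeError("zero?-exp: operand must be a number")
        return v1 == 0
    if tag == "if":
        v1 = value_of(exp[1], env)
        if not isinstance(v1, bool):
            raise DynamicTypeError("if-exp: test must be a boolean")
        return value_of(exp[2] if v1 else exp[3], env)
    if tag == "let":
        return value_of(exp[3], {**env, exp[1]: value_of(exp[2], env)})
    if tag == "proc":
        return Closure(exp[1], exp[2], env)
    if tag == "call":
        rator = value_of(exp[1], env)
        rand = value_of(exp[2], env)
        if not isinstance(rator, Closure):
            raise DynamicTypeError("call-exp: operator must be a procedure")
        return value_of(rator.body, {**rator.env, rator.var: rand})
    raise ValueError(f"unknown expression: {exp!r}")
```

In this sketch, evaluating ("call", ("const", 0), ("const", 0)), the tuple encoding of (0 0), raises DynamicTypeError, whereas looking up an unbound variable raises KeyError, which is an error but not a type error.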
Not all LET, PROC, and LETREC programs are type-safe. That leads to the following question:
Is type safety a static property of LET, PROC, or LETREC?
That's the same as asking whether LET, PROC, and LETREC are strongly typed.
It so happens that LET is strongly typed. That is not terribly interesting, however, because LET is not a very expressive language. For example, there is no LET expression exp such that, for all integer values n, let x = n in exp evaluates to the absolute value of n. For another example, it is not possible to write an infinite loop in the LET language.
The interesting question is whether PROC is strongly typed.
If PROC were strongly typed, then type safety would be a static property of PROC programs. In other words, there would be some algorithm that takes an arbitrary PROC program as input and decides whether the program is type-safe. In particular, that algorithm would be able to decide whether an arbitrary program of the form
if <expression> then (0 0) else (0 0)
is type-safe.
It should be obvious that programs of that form are type-safe if and only if the <expression> does not halt. If PROC were strongly typed, therefore, then there would be some algorithm that takes an arbitrary expression as input and decides whether the expression halts.
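The reduction in that argument can be written out as a short sketch. The function is_type_safe below is a hypothetical decision procedure passed in as a parameter (the argument shows that no such procedure can exist for PROC), and wrap_in_if and make_halting_decider are names introduced here for illustration. Expressions are assumed to be encoded as Python tuples.

```python
# The expression (0 0): calling the number 0 as if it were a procedure.
ZERO_APPLIED_TO_ZERO = ("call", ("const", 0), ("const", 0))


def wrap_in_if(exp):
    """Build the program  if <exp> then (0 0) else (0 0)."""
    return ("if", exp, ZERO_APPLIED_TO_ZERO, ZERO_APPLIED_TO_ZERO)


def make_halting_decider(is_type_safe):
    """Turn a (hypothetical) type-safety decider into a halting decider.

    Per the claim in the text, the wrapped program is type-safe
    if and only if exp does not halt.
    """
    def halts(exp):
        return not is_type_safe(wrap_in_if(exp))
    return halts
```

If is_type_safe were a real algorithm, halts would decide the halting problem, which is the contradiction the argument relies on.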
Theorem. For all Turing-complete programming languages, the halting problem is undecidable.
PROC (unlike LET) is Turing-complete. Because the halting problem is undecidable, no algorithm is able to decide whether an arbitrary PROC program is type-safe. In other words, PROC is not strongly typed. Since PROC is a proper subset of the LETREC language, LETREC is not strongly typed either.
The undecidability of the halting problem tells us that no general-purpose programming language can be strongly typed, assuming type safety and strong typing are defined as above.
That's not the answer we want.
We can't have the answer we want.
We can, however, change the definition of type safety and/or strong typing so we can pretend to have the answer we want. The standard way to do that is:
That last step means we redefine strongly typed to mean
We'll start by defining a static type system for PROC:
(type-of (const-exp num) tenv) = int

(type-of (var-exp var) tenv) = tenv(var)

(type-of exp1 tenv) = int    (type-of exp2 tenv) = int
--------------------------------------------------------------------
(type-of (diff-exp exp1 exp2) tenv) = int

(type-of exp1 tenv) = int
----------------------------------------------------------------
(type-of (zero?-exp exp1) tenv) = bool

(type-of exp1 tenv) = bool    (type-of exp2 tenv) = t    (type-of exp3 tenv) = t
--------------------------------------------------------------------
(type-of (if-exp exp1 exp2 exp3) tenv) = t

(type-of exp1 tenv) = t1    (type-of body [var1=t1]tenv) = t
------------------------------------------------------------------------
(type-of (let-exp var1 exp1 body) tenv) = t

(type-of body [var1=t1]tenv) = t2
----------------------------------------------------------------------------
(type-of (proc-exp var1 body) tenv) = (t1 → t2)

(type-of exp1 tenv) = (t1 → t2)    (type-of exp2 tenv) = t1
--------------------------------------------------------------------
(type-of (call-exp exp1 exp2) tenv) = t2
Definition. A PROC program (a-program exp) is well-typed if and only if there exists some type t such that the typing rules for PROC can be used to prove (type-of exp tenv0) = t, where tenv0 = [i:int,v:int,x:int] is the initial type environment that specifies the types of all variables bound in the standard initial environment.
The next step is to prove:

Theorem. If P is a well-typed PROC program, then P is type-safe.

That theorem is proved by induction on the number of calls to value-of that occur during the evaluation of P.
The next step is to prove that well-typedness is statically decidable.
The usual way to prove the decidability of some problem is to describe an algorithm that decides the problem. Such an algorithm is said to be a decision procedure.
It is easy to describe a decision procedure for determining whether a LET program is well-typed:
Algorithm. Given a LET program (a-program exp), use the following algorithm to decide whether exp is well-typed with respect to the initial type environment tenv0. In each case below, tenv is the type environment in which exp is being checked (initially tenv0).

If exp is a constant expression, then exp is well-typed with respect to tenv.

If exp is a variable x, then exp is well-typed with respect to tenv if and only if x is bound in the type environment tenv.

If exp is of the form (diff-exp exp1 exp2), then exp is well-typed with respect to tenv if and only if both exp1 and exp2 are well-typed in the type environment tenv and are of type int.

If exp is of the form (zero?-exp exp1), then exp is well-typed with respect to tenv if and only if exp1 is well-typed in the type environment tenv and is of type int.

If exp is of the form (if-exp exp1 exp2 exp3), then exp is well-typed with respect to tenv if and only if exp1, exp2, and exp3 are well-typed in the type environment tenv, exp1 is of type bool, and exp2 and exp3 are of the same type.

If exp is of the form (let-exp var1 exp1 body), then exp is well-typed with respect to tenv if and only if exp1 is well-typed in the type environment tenv and body is well-typed in the type environment [var1=t1]tenv, where t1 is the type of exp1.
That decision procedure is just the obvious algorithm that uses the typing rules for the LET language to compute the type of an expression.
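As an illustration, that decision procedure might be sketched in Python as follows. This is only a sketch under stated assumptions: expressions are encoded as tuples such as ("diff", e1, e2), type environments are dictionaries, types are the strings "int" and "bool", and the names IllTyped, TENV0, and is_well_typed are introduced here rather than taken from the course code.

```python
class IllTyped(Exception):
    """Raised when no typing rule applies to an expression."""


# Initial type environment tenv0 = [i:int, v:int, x:int].
TENV0 = {"i": "int", "v": "int", "x": "int"}


def type_of(exp, tenv):
    tag = exp[0]
    if tag == "const":
        return "int"
    if tag == "var":
        if exp[1] not in tenv:
            raise IllTyped(f"unbound variable {exp[1]}")
        return tenv[exp[1]]
    if tag == "diff":
        if type_of(exp[1], tenv) == "int" and type_of(exp[2], tenv) == "int":
            return "int"
        raise IllTyped("diff-exp: both operands must have type int")
    if tag == "zero?":
        if type_of(exp[1], tenv) == "int":
            return "bool"
        raise IllTyped("zero?-exp: operand must have type int")
    if tag == "if":
        t1 = type_of(exp[1], tenv)
        t2 = type_of(exp[2], tenv)
        t3 = type_of(exp[3], tenv)
        if t1 != "bool":
            raise IllTyped("if-exp: test must have type bool")
        if t2 != t3:
            raise IllTyped("if-exp: branches must have the same type")
        return t2
    if tag == "let":
        t1 = type_of(exp[2], tenv)
        # Check the body in the extended environment [var1=t1]tenv.
        return type_of(exp[3], {**tenv, exp[1]: t1})
    raise ValueError(f"not a LET expression: {exp!r}")


def is_well_typed(program):
    """Decide whether ("a-program", exp) is well-typed with respect to TENV0."""
    try:
        type_of(program[1], TENV0)
        return True
    except IllTyped:
        return False
```

Every recursive call is made on a strictly smaller subexpression, so the procedure always terminates, which is what makes it a decision procedure.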
If we try to extend that algorithm to proc expressions, we run into a problem:

(type-of body [var1=t1]tenv) = t2
----------------------------------------------------------------------------
(type-of (proc-exp var1 body) tenv) = (t1 → t2)
With the other rules, every type that occurs in the hypotheses of the rule is either a fixed type (such as int) or is the type of some subexpression. With proc expressions, however, it looks like we'd have to guess the type of the bound variable.
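The need to guess can be seen in a small sketch. The function type_of_proc below takes the guessed type t1 as an explicit argument because the rule gives no way to compute it. The tuple encoding of expressions, the (t1, "->", t2) representation of procedure types, and the function names are assumptions made for this illustration only.

```python
def type_of(exp, tenv):
    """Return the type of a proc-free expression, or None if no rule applies."""
    tag = exp[0]
    if tag == "const":
        return "int"
    if tag == "var":
        return tenv.get(exp[1])
    if tag == "diff":
        ok = type_of(exp[1], tenv) == "int" and type_of(exp[2], tenv) == "int"
        return "int" if ok else None
    if tag == "zero?":
        return "bool" if type_of(exp[1], tenv) == "int" else None
    if tag == "if":
        t1, t2, t3 = (type_of(e, tenv) for e in exp[1:4])
        return t2 if t1 == "bool" and t2 is not None and t2 == t3 else None
    return None


def type_of_proc(var, body, tenv, guess):
    """Apply the proc-exp rule with `guess` standing in for t1."""
    t2 = type_of(body, {**tenv, var: guess})
    return None if t2 is None else (guess, "->", t2)


# proc (x) -(x,1): only the guess int works.
decrement_body = ("diff", ("var", "x"), ("const", 1))
# proc (x) x: every guess works, so the rule alone does not pick one type.
identity_body = ("var", "x")
```

With these definitions, type_of_proc("x", decrement_body, {}, "bool") finds no typing, while type_of_proc("x", identity_body, {}, t) succeeds for any guess t, so the algorithm cannot simply try one candidate and stop.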
There are two standard ways to deal with this problem:
Each of these approaches has its own advantages and disadvantages:
Historically, most programming languages have been designed by the same people who implement them, so most programming languages place the burden on users instead of implementors. Although that is a fairly trivial basis for making such an important design decision, it really does seem to have been the most influential factor in most programming languages.
Last updated 22 March 2008.