This is a brief and very informal presentation of the course.

Type theory is a discipline which straddles both mathematics and computing science. In mathematics, type theory has strong roots within mathematical logic, while recent research (since 2009) found important links with geometry. In computing science, type theory had (and still has!) a huge influence on the design of programming languages.

This course mainly focuses on the theoretical / mathematical aspects
of types, especially in functional programming. Some practical
activity is also planned. The only *strong* prerequisite is
a general mathematical mindset (abstract reasoning skills). See
ESSE3 for the complete syllabus.

To start, we can ask ourselves if \( 5 = \langle 1,2 \rangle \) holds. The question may appear strange at first, and without a more precise context, it could be answered in two ways.

One might answer that such
equality does not hold: \(5\) is a number, and not a pair. Indeed,
according to the
de
facto standard set theory axioms, it does not hold. However, using
the same set theory, we might find out that the equation \( 5 = \{
0,1,2,3,4 \} \) instead *does* hold, since \(5\) is defined in
precisely that way. So, \(5\) is not a pair but it *is* a set, which
might be just as puzzling. In practice, one should regard this fact as an
accident, an artifact of the "low-level" definition of \(5\).
Accordingly, once the fundamental properties of the naturals have been proved,
mathematical proofs should avoid exploiting such low-level artifacts,
relying instead on the higher-level algebraic structure of \(\mathbb{N}\).

There is however another answer to \( 5 = \langle 1,2 \rangle \). One
could say that this equation is not just false, it is meaningless. We
could in fact restrict ourselves to study only those equations of the
form \( x = y \) where \(x,y\) are objects of the same "nature".
Precisely formalizing the requirement is not trivial. One might be
tempted to require that \(x,y \in A \) for the same set \(A\), but
unfortunately set theory allows us to construct the set \( A =
\mathbb{N} \cup \mathbb{N}^2 \), which contains both numbers and
pairs. We want a different notion: we want to require that \(x,y\) have
the same *type*.

In programming languages, we sometimes face similar issues. Consider the following (quite contrived) pseudo-code:

1) a, b = factorize(93274875782361);
2) print("6" + 7);
3) print(a);

In line 1, assume that the factorization takes a long time to execute, producing two non-trivial factors when possible. In line 3, one such factor is printed. The real question is: what happens in line 2?

Line 2 adds a string and a number. Real-world programming languages
widely diverge on the semantics of `"6" + 7`.
Below, we discuss the behaviour of several languages.

Some of them (Python, Ruby) will execute line 1 and,
*after* the long computation, report a type error in line
2. Here, the program is aborted before it can print the result
in line 3.
Others (Java, Scala, C#, JavaScript/ECMAScript) will instead
automatically convert the number `7` to a string, and
concatenate the strings, printing `67` in line 2.
Others (Perl 5, PHP) will instead automatically convert the
string `"6"` to a number, and add the numbers,
printing `13` in line 2.
Others (C, C++) will instead automatically convert the
string `"6"` to a pointer-to-char, and add 7 to that pointer.
Roughly, this would have the effect of discarding the first 7
characters of the string, had it been longer. Since this cannot
be done here, the behaviour of the program
is *undefined*. According to the ISO standards, executing such
a program can do absolutely anything, with no restrictions
whatsoever. In most implementations, the program crashes, or prints
some random characters.
Others (Rust, Pascal, OCaml, Haskell) instead detect the type
error *before* the program even starts to run. This is done by
exploiting a set of *typing rules* which check the
syntax of the program (without running it). When this check succeeds,
we know that the program will never perform
any ill-typed operation at run time. A famous slogan here is
"well-typed programs cannot go wrong".

When types are checked during the program execution, we speak about
*dynamic* type checking. When they are checked before the program
starts, we speak about *static* type checking.
Type theory focuses on the static aspects of types,
including how the typing rules should be designed.
Given a set of such rules, a few important problems arise:
type checking (is a given expression of the given type?),
type inference (find a type for a given expression),
term inference (find an expression for the given type).
These problems may be trivial when the allowed types are trivial,
as in the basic example above (where we only have strings and integers),
but quickly become more complex once we allow more complex types
(see below for a few examples).

A final remark: the focus of the course is *not* the in-depth
study of all the languages above. Indeed, real-world languages have
many, many features and examining them would take a very long
time. Still, when discussing the theory underlying types, we will
mention a few specific type-related features in such languages,
also involving some practice.

Below, we provide a few random formulas, to hopefully whet the appetite of prospective students. These are not really explained here (that's what the course is for!), but having a peek at them might still be interesting.

\[ \begin{array}{ll} \mathbb{B} & \mbox{booleans} \\ \mathbb{N} & \mbox{natural numbers} \\ \mathbb{S} & \mbox{strings} \\ \ldots \\ 0 & \mbox{empty} \\ 1 & \mbox{unit (has a single element)} \\ A + B & \mbox{sum, or coproduct, or disjoint union}\\ A \times B & \mbox{product, or cartesian product}\\ B^A \quad\mbox{or}\quad (A \rightarrow B) & \mbox{exponential, or function space}\\ \sum_{x:A} B(x) & \mbox{dependent sum} \\ \prod_{x:A} B(x) & \mbox{dependent product} \\ \end{array} \]

\[ \begin{array}{l} \mathbb{B} \simeq 1 + 1 \\ A \times B \simeq B \times A \\ A \times (B + C) \simeq A \times B + A \times C \\ A^B \times A^C \simeq A^{B + C} \\ A \times A \simeq A^{\mathbb{B}}\\ \sum_{x:A} B \simeq A \times B \\ \prod_{x:A} B \simeq B^A = (A \rightarrow B) \\ \end{array} \]
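Some of these isomorphisms can be witnessed by ordinary Haskell programs (a sketch; the names are ours, with `Either` playing the role of the sum type):

```haskell
-- A witness for A × B ≅ B × A:
swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)

-- Witnesses, in both directions, for A × (B + C) ≅ A × B + A × C:
distribute :: (a, Either b c) -> Either (a, b) (a, c)
distribute (x, Left y)  = Left (x, y)
distribute (x, Right z) = Right (x, z)

factor :: Either (a, b) (a, c) -> (a, Either b c)
factor (Left (x, y))  = (x, Left y)
factor (Right (x, z)) = (x, Right z)
```

Composing `distribute` and `factor` in either order gives back the input, which is exactly what the \(\simeq\) sign claims.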

Read \( \Gamma \vdash e:T \) as "assuming \(\Gamma\), the expression \(e\) has type \(T\)". \[ \begin{array}{ll} \dfrac{ \Gamma \vdash a : A \qquad \Gamma \vdash b : B }{ \Gamma \vdash \langle a, b \rangle : A \times B } & \dfrac{ \Gamma \vdash e : A \times B }{ \Gamma \vdash \pi_1(e) : A } \\[5mm] \dfrac{ \Gamma, x : A \vdash e : B }{ \Gamma \vdash (\lambda x : A.\ e) : A \rightarrow B } & \dfrac{ \Gamma \vdash f : A \rightarrow B \qquad \Gamma \vdash a : A }{ \Gamma \vdash (f\; a) : B } \\ \end{array} \]

Given the type \( A \times B \rightarrow A \), we can construct a program having that type (the program which projects the first component). Given the logical formula \( A \land B \rightarrow A \), we can construct a proof for it (it's a theorem).

Given the type \( A \times B \rightarrow C \), we can NOT construct a program having that type (how to construct a value in \(C\)?). Given the logical formula \( A \land B \rightarrow C \), we can NOT construct a proof for it.

Given the type \( (A \times (A \rightarrow B)) \rightarrow B \), we can construct a program having that type (given \(\langle a,f \rangle\), return \(f(a)\)). Given the logical formula \( (A \land (A \rightarrow B)) \rightarrow B \), we can construct a proof for it (it's a theorem).
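The two inhabited types above can be written directly as Haskell programs (names are ours, for illustration):

```haskell
-- A program of type A × B → A: project the first component.
proj1 :: (a, b) -> a
proj1 (x, _) = x

-- A program of type (A × (A → B)) → B: given ⟨a, f⟩, return f(a).
applyPair :: (a, a -> b) -> b
applyPair (x, f) = f x
```

No such program can be written at type `(a, b) -> c`, matching the unprovable formula \( A \land B \rightarrow C \).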

This is not a coincidence, it is an isomorphism!

Recursion is a common tool when writing programs. For instance,

factorial n = if n == 0 then 1 else n * factorial (n - 1)

is a recursive factorial program, which follows the mathematical definition. Even if it is less common, recursion can also be used on types. Below, we can recursively define the types of naturals, of "lists of A", of "binary trees of A" (each node has two children), and of "infinitary trees of A" (each node has infinitely many children). \[ \begin{array}{l} \mathbb{N} \simeq 1 + \mathbb{N} \\ ListA \simeq 1 + (A \times ListA) \\ BinTreeA \simeq 1 + (A \times BinTreeA \times BinTreeA) \\ InfTreeA \simeq 1 + (A \times (\mathbb{N} \rightarrow InfTreeA)) \\ \end{array} \] Each recursive type has an associated induction principle.
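These recursive types can be transcribed as Haskell datatype declarations (a sketch; the constructor names are ours):

```haskell
-- N ≅ 1 + N
data Nat = Zero | Succ Nat

-- List A ≅ 1 + (A × List A)
data List a = Nil | Cons a (List a)

-- BinTree A ≅ 1 + (A × BinTree A × BinTree A)
data BinTree a = BLeaf | BNode a (BinTree a) (BinTree a)

-- InfTree A ≅ 1 + (A × (N → InfTree A))
data InfTree a = ILeaf | INode a (Nat -> InfTree a)

-- The induction principle for Nat is just structural recursion:
toInt :: Nat -> Int
toInt Zero     = 0
toInt (Succ n) = 1 + toInt n
```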

In a language without (static) types, we can easily define a function for projecting out the second component of a pair: \[ f(x,y) = y \] What about in typed languages? There, we need to associate a type to such \(f\). If the language has only a basic, primitive type system (as in C), we might have to resort to a family of functions: \[ \begin{array}{l} f_1 \ : \ \mathbb{N} \times \mathbb{B} \rightarrow \mathbb{B} \\ f_1 (x,y) = y \\ f_2 \ : \ \mathbb{N}^2 \times \mathbb{N} \rightarrow \mathbb{N} \\ f_2 (x,y) = y \\ f_3 \ : \ \mathbb{S} \times \mathbb{N}^3 \rightarrow \mathbb{N}^3 \\ f_3 (x,y) = y \\ \ldots \end{array} \] This is pretty inconvenient: we need to replicate the function for all the type combinations we are interested in. Further, if the set of types is infinite, this will only cover a finite number of cases. A richer type system can instead provide polymorphic types, allowing universal quantification over type variables \(A,B\): \[ \begin{array}{l} f \ : \ \forall A,B.\ A \times B \rightarrow B \\ f (x,y) = y \\ \end{array} \] Alternatively, two "type-valued" arguments can be explicitly passed, making the function a "dependent function": \[ \begin{array}{l} f \ : \ \prod_{A,B:{\sf Type}} A \times B \rightarrow B \\ f (A,B,x,y) = y \\ \end{array} \]
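In Haskell, the polymorphic version is a one-liner, with the universal quantification over type variables left implicit (the name `second` is ours):

```haskell
-- f : ∀ A,B. A × B → B, written with implicit quantification:
second :: (a, b) -> b
second (_, y) = y
```

A single definition now works uniformly at every pair type, replacing the whole family \(f_1, f_2, f_3, \ldots\)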

We have seen above that, given the type \( \forall A,B.\ A \times B \rightarrow B \), we can construct a program having that type: the program which projects the second component. Is there any non-equivalent program sharing the same type? One could propose a polymorphic function like \[ f(a,b) = \begin{cases} b+1 & \mbox{if } B = \mathbb{N} \\ b & \mbox{otherwise} \end{cases} \] The above is called "ad hoc polymorphism". What if we instead forbid run-time type equality checks such as \( B = \mathbb{N} \), so that we get a *parametric* form of polymorphism? Is there any other program?

Is there any program of type \( \forall A.\ A \rightarrow A \) except the identity program?

Let \( [A] \) stand for the type "finite lists of elements of \(A\)". Is it true that for any program of type \(\forall A.\ [A] \rightarrow [A] \), when run in the case \( A = \mathbb{N} \), doubling the numbers in the input list causes the numbers in the output list to be doubled?
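As an illustration of the property in question, take the hypothetical `mystery` below to stand for an arbitrary parametric program of type \(\forall A.\ [A] \rightarrow [A] \):

```haskell
-- A parametric function of type [a] -> [a] can only rearrange,
-- drop, or duplicate elements; it cannot inspect them. Hence it
-- commutes with map: mystery (map f xs) == map f (mystery xs).

-- One concrete instance of such a program:
mystery :: [a] -> [a]
mystery = reverse . take 2

-- Doubling the input doubles the output:
check :: Bool
check = mystery (map (* 2) [1, 2, 3 :: Int]) == map (* 2) (mystery [1, 2, 3])
```

The question is whether this holds for *every* program of that type, not just this one.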

The answer comes at a very low price.

Claim: Functional programming and the lambda calculus are only theoretical toys, with no practical relevance.

Counter-argument: A few decades ago, the claim might have been true. However, lambda-abstractions have now found their way into many imperative languages: Java 8, Scala, C#, C++, JavaScript/ECMAScript, Python, Rust, ... Of course, functional languages have always had lambdas: Scheme, Lisp, OCaml, Haskell, ... Many companies now look for functional programming skills, and even influential recruiters strongly recommend that functional programming be taught in courses.

Claim: Sophisticated type systems are purely theoretical.

Counter-argument:
The type system of the Girard-Reynolds polymorphic lambda calculus
is the basis for the so-called "generics", which are now found
in Java, Scala, C#, Rust, ... (and Haskell, OCaml).
The extended type system System \(F\omega\) is the basis
for the "higher kinds" of Scala and Haskell.
Dependent types (as found e.g. in Martin-Löf's Intuitionistic Type Theory, or
in Coquand's Calculus of Constructions) are more complex and not
widely used, *yet*, beyond research languages (Coq, Agda).
Some simpler types, inspired by dependent types,
have begun to appear (e.g. GADTs in Haskell).


Roberto Zunino, 2016