hm-comparison.tex

\section{Comparison to Hindley-Milner type inference}

% TODO let-polymorphism comparison
Schwartzbach~\cite{schwartzbach1995polymorphic}
gives an accessible introduction to the tradeoffs of
global type inference
in the style of Milner~\cite{milner1978theory}.
%The use of a polymorphic let-construct

% TODO I thought Mairson was the source of these large benchmarks when I wrote this,
% but it's actually Kanellakis and Mitchell, POPL '89.
% I think a lot of the following can be removed and simplified
% to just talk about Examples 3.1, 3.4, and 3.4 from POPL '89.
% The test suite in lti-model has notes about all their behaviors with
% symbolic closures.

Related performance issues have infamously arisen in ML, summarized
by Mairson~\cite{Mairson:1989}, whose work confirmed
that the problem of ML type-checking is \textsc{DExpTime}-complete.
Several ML programs that exhibit exponential
growth in the size of principal type schemes of certain (pathological)
programs are provided as part of Mairson's initial investigation.
We use their benchmark from Appendix A~\cite{Mairson:1989} to stress-test our implementation of symbolic closures.

The benchmark uses Church pairs (an encoding of pairs using only lambdas)
to build an expression that has an enourmous principal function type,
both exponentially large in its size and number of type variables, relative
to the program size.
First, to demonstrate how the benchmark works, 
\figref{symbolic:figure:pair-benchmark}
presents a slightly smaller program size
($n=3$, where $n$ is the number of times size of the type duplicates).

%\begin{lstlisting}[language=ml]
%let fun pair x y = fn z => z x y in
%  let fun x1 y = pair y y in
%    let fun x2 y = x1(x1(y)) in (*@\label{symbolic:pair-x2:x2}@*)
%      x2(fn z => z)(*@\label{symbolic:pair-x2:x0}@*)
%    end
%  end
%end;
%val it = fn
%  : ((((?.X1 -> ?.X1) -> (?.X1 -> ?.X1) -> ?.X2) -> ?.X2)
%     -> (((?.X1 -> ?.X1) -> (?.X1 -> ?.X1) -> ?.X2) -> ?.X2) -> ?.X3)
%    -> ?.X3
%\end{lstlisting}

\begin{figure}
\begin{lstlisting}[language=ml, numbers=left]
let fun pair x y = fn z => z x y in
let fun x1 y = pair y y in(*@\label{symbolic:pair-x3:x1}@*)
  let fun x2 y = x1(x1(y)) in(*@\label{symbolic:pair-x3:x2}@*)
    let fun x3 y = x2(x2(y)) in(*@\label{symbolic:pair-x3:x3}@*)
      x3(fn z => z)(*@\label{symbolic:pair-x3:x0}@*)
    end
  end
end
end;
val it = fn : ((T -> T -> ?.X5) -> ?.X5)(*@\label{symbolic:pair-x3:it}@*)
    where T = ((U -> U -> ?.X4) -> ?.X4)
    where U = ((V -> V -> ?.X3) -> ?.X3)
    where V = ((W -> W -> ?.X2) -> ?.X2)(*@\label{symbolic:pair-x3:X2}@*)
    where W = (?.X1 -> ?.X1)(*@\label{symbolic:pair-x3:W}@*)
\end{lstlisting}
\caption{Benchmark demonstrating exponential growth of principal type schemes in ML.
  Types are manually abbreviated for readability, ungeneralized type variables are
  instantiated to dummy types (\sml{X1},\sml{X2},...).}
\label{symbolic:figure:pair-benchmark}
\end{figure}

The abbreviated output type \sml{it} (line \ref{symbolic:pair-x3:it}) is the result of the call to \sml{x3}
on line \ref{symbolic:pair-x3:x0}.
For each function $\text{\sml{x}}_n$, the range of its principal type is twice as large as
the range of the principle type of $\text{\sml{x}}_{n-1}$.
This pattern starts with \sml{x1} (line \ref{symbolic:pair-x3:x1}),
whose type is \sml{'a -> (('a -> 'a -> 'b) -> 'b)}---note that \sml{'a}
is duplicated, which provides a foundation for the exponential growth.
Then, \sml{x2} (line \ref{symbolic:pair-x3:x2})
has a principal type that is twice as large as \sml{x1}'s because it
iteratively calls \sml{x1} twice on its input---thus \sml{x2} duplicates its input \emph{four} times.
Extending this reasoning, \sml{x3} (line \ref{symbolic:pair-x3:x3})
has a principal type twice as large as \sml{x2}'s,
duplicating its input \emph{sixteen} times.
This is the number number of times \sml{W} (line \ref{symbolic:pair-x3:W},
the type of the argument \sml{fn z => z} on line \ref{symbolic:pair-x3:x0})
occurs in the output type (line \ref{symbolic:pair-x3:it}).

As mentioned, the number of type variables introduced is also exponential
in the size of the program. For $n=3$, $2^{n-1}=4$ variables are introduced---they are the types from
\sml{?.X2} onwards on lines \ref{symbolic:pair-x3:it}-\ref{symbolic:pair-x3:X2}.
For $n=4$, the output type will have \sml{W} occurring 32 times,
and introduce 8 type variables.
Mairson's original experiment with $n=5$ output reams of principal
types before crashing (circa 1989).
Repeating this experiment on a 2011 Macbook Pro with a 30-year-newer version of Standard ML
completed in 45 seconds (with around 20,000 lines of principal types).

% - parametric polymorphism
%  - (map f (map g l))
%    - "We are concerned with a conceptual framework in which these map functions may all be regarded semantically as the same object"
%      - Milner 1978;
%    - ie. in contrast to adhoc polymorphism
%      - where instead:
%         - (let [map1 ...
%                 map2 ...]
%             (map2 f (map1 g l)))
%    - Milner-style type inference (apparently?) applies to either case
%      - (compatibility with adhoc polymorphism seems tentative in 1978 paper)
%  - [?] type of a local function is the intersection of all the types it was checked at
%    - ordered intersection?
% - type schemes
%   - we don't attempt to infer type schemes
%     - (let [f (fn [y] y)] (f 1))
%     - in ML:
%       - infer f as (All [a] [a -> a]), via unification based constraint solving
%       -