Project on the Design Principles of Programming Languages (2020) at Peking University.
[toc]
We have spent nearly three months to complete and polish this project. We have rewritten most of the codes provided by TAPL [9], including 2500+ line additions and 1200+ line deletions. During the process of development, we read several interesting papers and learnt a lot from them.
- Renze Chen: Frontend
- Syntax/Parsing
- Type Inference
- Chengke Wang: Backend
- CPS Transformation
- Evaluation
More examples can be found in /examples
directory.
/* Output: 3 */
succ succ succ 0;
/* skip over the first two `succ` */
/* Output: 1 */
succ succ (shift k in succ 0);
/* skip over the second `succ` */
/* Output: 2 */
succ (reset succ (shift k in succ 0));
/*
succ_1 ... | [ null, null ]
reset ... | [ succ_1, null ]
succ_2 ... | [ null, succ_1 ]
shift ... | [ succ_2, succ_1 ]
succ_3 ... | [ null, succ_1]
1 | [ null, succ_1]
succ_1 1 | [ null, null ]
2
*/
/* resume the captured continuation */
/* Output: 3 */
succ (shift k in k (k (k 0)));
/* type modification: from Nat to Bool */
/* Output 1 */
if (reset succ (shift k in true)) then 1 else 2;
/*
if ... | [ null, null ]
reset (...) | [ if..., null ]
succ (shift k in true) | [ null, if... ]
shift k in true | [ succ, if... ]
true | [ null, if... ]
if true ... | [ null, null ]
1 | [ null, null ]
*/
/* Nondeterministic programming */
let (+) = fix add. a b.
if iszero a
then b
else add (pred a) (succ b)
in let (*) = fix mul. a b.
if iszero a
then 0
else b + (mul (pred a) b)
in let append = fix append. l1 l2.
lmatch l1 {
case nil => l2
case hd :: tl => hd :: append tl l2
}
in let choose = lambda l. shift k in (fix choose. l k.
lmatch l {
case nil => []
case hd :: tl => append (k hd) (choose tl k)
} ) l k
in reset
let a = choose [1, 2] in
let b = choose [3, 4] in
[[a + b * a, a * b + a]];
Suppose the input file is test.f
. Run the following command to build and run the interpreter.
./run.sh test.f
The make interface
target is for build *.cmi
and *.cmti
files. These files facilitate code completion and type inference in Merlin.
Delimited continuations were first introduced by Felleisen in 1988. It is a powerful tool for abstracting control. They can be used for implementing
- Generator function
- Nondeterministic programming
- Exception handling
- Automatic differentiation (AD) [8]
- Typed Printf [5]
However, many languages don't support the continuation manipulation, though they may provide a much weaker form like exception handing, generator function. Examples are Java, Ruby, Python. Although indeed there exist some languages that support it, either they are not typed (Scheme, Racket) or they impose strong restriction on the way using it (Standard ML, Delimcc Library of OCaml). The most common restriction is the answer type modification.
Without a type system, the programs written with continuation operators are error-prone and hard-to-debug. Unfortunately, It is not easy to implement delimited continuation operators in standard FP languages, especially in a typed setting. The reason is that the answer type modification feature and the context capturing are highly coupled with type system and the evaluator logic. It becomes less and less realistic to rewrite the core parts of a full-fledged typed language.
To overcome these problems, we design and implement a typed delimited continuation language. Specifically, our design targets and the main achievements are:
- We design a language that support delimited continuation operators (
reset
andshift
). We formalized its syntax, evaluation rules, as well as typing rules. - To ensure the language safety (or soundness), we implemented the type checker. A program that passes the check has the property of progress and preservation.
- We dispense with the need for modification with interpreters / evaluators. Instead, by first using CPS transformation to eliminate explicit continuation operators, we can execute the program in the standard language.
Figure A shows a common process for type checking and evaluating a program. This method suffers from the complexity of building a evaluator that supports reset / shift
operators, as discussed in the previous section, Their major difficulty is that we need to manage the evaluation context explicitly.
To avoid modifying the standard evaluator, we take a different approach as illustrated by Figure B. After type checking the program, we can first erase the type information (because of the erasure property) and then transform the original program into a new one that does not contain reset / shift
operators. Note that the transformation does not change the behavior of the program. In other words, the new and the old program must produce the same running output.
In this section, we are going to formalize the types and terms in the language.
We introduce the concept of "annotation". It annotates whether shift
operator is used during evaluation of the term. If shift
is used, we say the term is impure; otherwise, we treat it as a pure one. We will soon see in the next section that the distinguishment between pure and impure terms help the CPS transformer eliminate unnecessary transformation and produce better result.
Constraint-based Typing Rules with Answer Type and Purity Annotation
Abstract machine state:
-
$t$ : Term -
$C$ : Inner Context -
$K$ : Outer C(K)ontext
where
Suppose
If
then there must be
So the typing information of
Note that
The algorithmic evaluation rules above clearly shows the context/stack changing during evaluation, and can be easily used to implement a evaluation algorithm. However, since not every algorithmic evaluation rule stands for a reduction (some of them stand for stack-pushing or stack-popping), it's annoying to use them to formalize the properties like progress and preservation. Therefore, we introduce a set of syntactic evaluation rules here.
Firstly we define the syntactic form of context:
It's easy to show that:
- If
$t\rightarrow^* t'$ under declarative evaluation rules, then there's some$t''$ with$t\ |\ [\emptyset,\emptyset ]\rightarrow^* t''\ |\ [C,K]$ and $ (C\leadsto K)(t'')=t'$ under algorithmic evaluation rules. - If
$t\ |\ [C,K]\rightarrow^* t'\ |\ [C',K']$ under algorithmic evaluation rules, then$(C\leadsto K)(t)\rightarrow^* (C'\leadsto K')(t')$ under declarative evaluation rules.
type assumption
-
$\Gamma$ : type context -
$t$ : term -
$T$ : type of$t$ -
$R$ : answer-type before evaluating$t$ -
$S$ : answer-type after evaluating$t$ -
$a$ : purity-annotation of$t$ -
$TC$ : Type-Constraints that$t$ introduces -
$AC$ : Annotation-Constraints that$t$ introduces
PS:
PS:
[4] discuss the approach for CPS transform. The paper prove that "transforming \lambda-term into CPS can be expressed in one pass by moving administrative redexes to translation time in a context-free way. The translation is easily extended to the usual constructs of applicative oder functional languages and also to account for control operators". We followed its idea to transform the reset / shift
operators into CPS style. The sketch below presents the core part of our transformation rules:
cps[ something ]
=> lambda k_outer.
do_something (lambda result. k_outer result)
cps[ succ v ]
=> lambda k_outer.
cps[v] (lambda v. k_outer (succ v))
id = lambda v. v
cps[ reset e ]
=> cps[e] id
cps[ shift k in e ]
=> lambda k_outer.
let k = lambda v. lambda k'. k' (k_outer v) in
cps[e] id
In this section, we are going to prove the soundness of our proposed language. For simplicity, we here just use part of syntactic evaluation rules and part of weaken typing rules (without purity annotations and polymorphism). The full proofs is easy to get by extending our ones.
type assumption
-
$\Gamma$ : Type context -
$T$ : Type of$t$ -
$R$ : Answer type before$t$ 's evaluation -
$S$ : Answer type after$t$ 's evaluation -
$TC$ : Type-Constraints
PS: We may abbreviate
- If
$\Gamma\vdash x:T@[R,S]$ , then$R=S$ and$x:T\in\Gamma$ - If
$\Gamma\vdash (\lambda x.t_1):T@[R,S]$ , then$R=S$ and there's some$X,T_1,R_1,S_1$ with$\Gamma,x:X\vdash t_1:T_1@[R_1,S_1]$ and$T=X\rightarrow T_1@[R_1,S_1]$ - If
$\Gamma \vdash (t_1\ t_2):T@[R,S]$ , then there's some$U,V,T_2$ with$\Gamma \vdash t_1:(T_2\rightarrow T@[R,U])@[V,S]$ and$\Gamma\vdash t_2:T_2@[U,V]$ - If
$\Gamma\vdash (\text{shift}\ k\ \text{in}\ t_1):T@[R,S]$ , then there's some$Z,T_1$ with$\Gamma,k:(T\rightarrow R@[Z,Z])\vdash t_1:T_1@[T_1,S]$ - If
$\Gamma\vdash (\text{reset}\ t_1):T@[R,S]$ , then$R=S$ and there's some$T_1$ with$\Gamma\vdash t_1:T_1@[T_1,T]$
- If
$v$ is a value of type$Bool$ , then$v$ is either$true$ or$false$ . - If
$v$ is a value of type$X\rightarrow Y@[R,S]$ , then there's some$x,t_1$ with$v=\lambda x.t_1$
If
By induction on typing derivation(or
-
$t=t_1\ t_2$ : By lemma-1 and the induction hypothesis:- There's
$t_1'$ with$t_1\rightarrow t_1'$ , thus$t\rightarrow t_1'\ t_2$ - There's
$P,x,t_1'$ with$t_1=P[\text{shift}\ x\ \text{in}\ t_1']$ , thus$t=(P\ t_2)[\text{shift}\ x\ \text{in}\ t_1']$
- There's
-
$t=v_1\ t_2$ : By lemma-1 and the induction hypothesis:- There's
$t_2'$ with$t_2\rightarrow t_2'$ , thus$t\rightarrow v_1\ t_2'$ - There's
$P,x,t_2'$ with$t_2=P[\text{shift}\ x\ \text{in}\ t_2']$ , thus$t=(v_1\ P)[\text{shift}\ x\ \text{in}\ t_2']$
- There's
-
$t=v_1\ v_2$ : By lemma-1, there's$T_2,X_2$ with$\Gamma \vdash_p v_1:T_2\rightarrow T@[R,X_2]$ . And by lemma-2, there's$x,t_1$ with$v_1=\lambda x.t_1$ , thus$t\rightarrow [x\mapsto v_2]t_1$ -
$t=\text{reset}\ t_1$ : By lemma-1 and the induction hypothesis:- There's
$t_1'$ with$t_1\rightarrow t_1'$ , thus$t\rightarrow \text{reset}\ t_1'$ - There's
$P,x,t_1'$ with$t_1=P[\text{shift}\ x\ \text{in}\ t_1']$ , thus$t\rightarrow \text{reset}\ [x\mapsto \lambda v.\text{reset}\ P[v]]t_1'$
- There's
-
$t=\text{reset}\ v_1$ : Obviously$t\rightarrow v_1$ -
$t=\text{shift}\ x\ \text{in}\ t_1$ : Obviously$t=P[\text{shift}\ x\ \text{in}\ t_1]$ , where$P=[]$
If
By induction on
-
$t=z$ :- If
$z=x$ , then$X=T,\ R=S,\ [x\mapsto v]t=v$ , thus$\Gamma \vdash [x\mapsto v]t:T@[R,S]$ . - If
$z\ne x$ , then$[x\mapsto v]t=t$ , thus$\Gamma\vdash [x\mapsto v]t:T@[R,S]$ .
- If
-
$t=\lambda y.t_1$ :- If
$y=x$ , then$[x\mapsto v]t=t$ , thus$\Gamma \vdash [x\mapsto v]t:T@[R,S]$ - If
$y\ne x$ , then- By lemma-1,
$R=S$ and there's some$Y,T_1,R_1,S_1$ with-
$T=Y\rightarrow T_1 @[R_1,S_1]$ , -
$\Gamma,x:X,y:Y\vdash t_1:T_1@[R_1,S_1]$ (thus$\Gamma,y:Y,x:X\vdash t_1:T_1@[R_1,S_1]$ ), -
$\Gamma,y:Y\vdash_p v:X$ .
-
- By the induction hypothesis,
$\Gamma,y:Y\vdash [x\mapsto v]t_1:T_1@[R_1,S_1]$ . - By typing rules,
$\Gamma \vdash (\lambda y.[x\mapsto v]t_1):T@[R,S]$ , thus$\Gamma \vdash [x\mapsto v]t:T@[R,S]$ .
- By lemma-1,
- If
-
$t=t_1\ t_2$ : Trivially proved by the induction hypothesis, lemma-1 and typing rules. -
$t=\text{reset}\ t_1$ : Trivially proved by the induction hypothesis, lemma-1 and typing rules. -
$t=\text{shift}\ y\ \text{in}\ t_1$ : Similar to the case$t=\lambda y.t_1$
If
$\Gamma \vdash t:X@[Y,S]$ $\Gamma,x:X\vdash P[x]:T@[R,Y]$
By induction on
-
$P=[]$ : Easily proved by letting$X=T$ and$Y=R$ -
$P=P_1\ t_2$ :- By lemma-1, there's
$T_2,U,V$ with$\Gamma \vdash P_1[t]:(T_2\rightarrow T@[R,U])@[V,S]$ and$\Gamma\vdash t_2:T_2@[U,V]$ - By the induction hypothesis, there's some
$X,Y,x$ with$\Gamma\vdash t:X@[Y,S]$ and$\Gamma,x:X\vdash P_1[x]:(T_2\rightarrow T@[R,U])@[V,Y]$ (also obviously$\Gamma,x:X \vdash t_2:T_2@[U,V]$ ) - By typing rules,
$\Gamma,x:X\vdash P_1[x]\ t_2:T@[R,Y]$ , i.e.$\Gamma,x:X\vdash (P_1\ t_2)[x]:T@[R,Y]$ , thus$\Gamma ,x:X\vdash P[x]:T@[R,Y]$
- By lemma-1, there's
-
$P=v_1\ P_2$ : Similar to the case$P=P_1\ t_2$
If
By induction on evaluation rule (or
-
$t=t_1\ t_2$ or$t=v_1\ t_2$ : Trivially proved by the induction hypothesis, lemma-1 and typing rules. -
$t=(\lambda x.t_1)\ v_2$ : Simply proved by lemma-3, lemma-1 and typing rules. -
$t=\text{reset}\ t_1$ : By lemma-1,$R=S$ and there's some$T_1$ with$\Gamma \vdash t_1:T_1@[T_1,T]$ - If
$t_1=P[\text{shift}\ k\ \text{in}\ t_1']$ :- By lemma-4, there's some
$X,Y,x$ ($x$ is fresh) with-
$\Gamma\vdash (\text{shift}\ x\ \text{in}\ t_1'):X@[Y,T]$ , $\Gamma,x:X\vdash P[x]:T_1@[T_1,Y]$
-
- By lemma-1, there's some
$Z,T_1'$ with$\Gamma,k:(X\rightarrow Y@[Z,Z])\vdash t_1':T_1'@[T_1', T]$ - By typing rules,
$\Gamma\vdash_p(\lambda x.\text{reset}\ P[x]):(X\rightarrow Y@[Z,Z])$ - By lemma-3,
$\Gamma\vdash [k\mapsto (\lambda v. \text{reset}\ P[v])]t_1':T_1'@[T_1',T]$ - By typing rules,
$\Gamma\vdash (\text{reset}\ [k\mapsto (\lambda v. \text{reset}\ P[v])]t_1'):T@[R,S]$ , i.e.$\Gamma\vdash t':T@[R,S]$
- By lemma-4, there's some
- Else, trivially proved by the induction hypothesis, lemma-1 and typing rules.
- If
-
$t=\text{reset}\ v_1$ : By lemma-1,$R=S$ and$\Gamma\vdash_p v_1:T$ , thus$\Gamma \vdash t':T@[R,S]$
[1] Asai and Uehara. Selective CPS Transformation for Shift and Reset [PEPM '18]
[2] Asai and Kameyama. Polymorphic Delimited Continuations [APLAS '07]
[3] Asai and Kiselyov. Introduction to Programming with Shift and Reset [Tutorial]
[4] Danvy and Filinski. Representing Control: A Study of the CPS transformation ['91]
[5] Asai. On Typing Delimited Continuations: Three New Solutions to the Printf Problem ['07]
[6] Danvy and Filinski. A Functional Abstraction of Typed Context ['89]
[7] Koppel. Capturing the Future by Replaying the Past. Proc. ACM Program. Lang., Vol. 2, No. ICFP, Article 76 ['18]
[8] Fei Wang and Xilun Wu. Demystifying Differentiable Programming: Shift/Reset the Penultimate Backpropagator ['19]
[9] Benjamin. Types and Programming Languages. The MIT Press